class_factory.concept_web package

class_factory.concept_web.ConceptWeb module

ConceptWeb Module

The ConceptWeb module provides tools to automatically extract, analyze, and visualize key concepts from lesson materials, helping to identify connections across topics and lessons. Central to this module is the ConceptMapBuilder class, which leverages a language model (LLM) to identify and structure important ideas and relationships from lesson readings and objectives into a graph-based representation.

Key functionalities of the module include:

Concept Extraction:
- Identifies key concepts from lesson readings and objectives using an LLM.
- Summarizes and highlights main themes from each lesson’s content.
Relationship Mapping:
- Extracts and maps relationships between identified concepts based on lesson objectives and content.
- Facilitates understanding of how topics interrelate within and across lessons.
Graph-Based Visualization:
- Constructs a concept map in which nodes represent concepts and edges represent relationships.
- Generates both interactive graph-based visualizations (HTML) and word clouds for key concepts.
Community Detection:
- Groups closely related concepts into thematic clusters.
- Helps identify broader themes or subtopics within the lesson materials.
Data Saving:
- Optionally saves intermediate data (concepts and relationships) as JSON files for further review or analysis.
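
The graph construction and grouping steps listed above can be illustrated with a minimal, standard-library-only sketch. It assumes relationships arrive as (concept, relation, concept) triples, matching the documented type of relationship_list. The helper names build_adjacency and group_concepts are hypothetical, and connected components are used here only as a simplified stand-in for whatever community detection the module actually performs with networkx.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Relationship = Tuple[str, str, str]  # (concept_a, relation, concept_b)


def build_adjacency(relationships: List[Relationship]) -> Dict[str, Set[str]]:
    """Build an undirected adjacency map: nodes are concepts, edges are relationships."""
    adjacency: Dict[str, Set[str]] = defaultdict(set)
    for source, _relation, target in relationships:
        adjacency[source].add(target)
        adjacency[target].add(source)
    return dict(adjacency)


def group_concepts(adjacency: Dict[str, Set[str]]) -> List[Set[str]]:
    """Group concepts into connected components (a crude stand-in for community detection)."""
    seen: Set[str] = set()
    groups: List[Set[str]] = []
    for start in sorted(adjacency):
        if start in seen:
            continue
        component: Set[str] = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        groups.append(component)
    return groups


# Illustrative data only; not taken from any real lesson.
example_relationships = [
    ("federalism", "divides", "sovereignty"),
    ("sovereignty", "is claimed by", "states"),
    ("judicial review", "checks", "legislation"),
]
example_groups = group_concepts(build_adjacency(example_relationships))
```

In this toy input the first two triples share the concept "sovereignty", so their concepts fall into one group, while the third triple forms a group of its own.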

Dependencies

This module depends on:

langchain_core: For LLM-based extraction and summarization tasks.
networkx: For graph generation and analysis of concept relationships.
matplotlib or plotly: For creating visualizations and word clouds.
Custom utilities for loading documents, extracting objectives, and handling logging.

Usage Overview

Initialize ConceptMapBuilder:
- Create a LessonLoader with paths to the syllabus file, reading materials, and project directory, then pass it to ConceptMapBuilder together with an LLM and the course name.
Generate the Concept Map:
- Use build_concept_map() to process lesson materials, extract and summarize concepts, map relationships, and generate visualizations.
Save and Review:
- The generated concept map can be saved as an interactive HTML file or as a static word cloud for easier review and analysis.

Example

from pathlib import Path

from class_factory.concept_web.ConceptWeb import ConceptMapBuilder
from class_factory.utils.load_documents import LessonLoader
from langchain_openai import ChatOpenAI

# Set up paths and initialize components
syllabus_path = Path("/path/to/syllabus.docx")
reading_dir = Path("/path/to/lesson/readings")
project_dir = Path("/path/to/project")
llm = ChatOpenAI(api_key="your_api_key")

# Initialize the lesson loader and concept map builder
lesson_loader = LessonLoader(syllabus_path=syllabus_path, reading_dir=reading_dir, project_dir=project_dir)
concept_map_builder = ConceptMapBuilder(
    lesson_no=1,
    lesson_loader=lesson_loader,
    llm=llm,
    course_name="Sample Course",
    lesson_range=range(1, 5)
)

# Build and visualize the concept map
concept_map_builder.build_concept_map()

class class_factory.concept_web.ConceptWeb.ConceptMapBuilder(lesson_no: int, lesson_loader: LessonLoader, llm, course_name: str, output_dir: str | Path = None, lesson_range: range | int = None, lesson_objectives: List[str] | Dict[str, str] = None, verbose: bool = False, save_relationships: bool = False, **kwargs)[source]

Bases: BaseModel

Generate concept maps (a form of knowledge graph) from lesson materials, using a language model (LLM) to summarize content, extract relationships, and visualize concepts in a structured graph format.

This class provides end-to-end functionality for concept map creation, including loading readings, summarizing content, extracting concept relationships, constructing graphs, and generating interactive and visual outputs like word clouds.

lesson_no

Current lesson number being processed.

Type: int

lesson_loader

Loader instance for handling lesson materials.

Type: LessonLoader

llm

Language model instance for summarization and relationship extraction.

Type: Any

course_name

Course name, used as context in LLM prompts.

Type: str

output_dir

Directory for saving generated outputs.

Type: Path

lesson_range

Range of lessons to process.

Type: range

save_relationships

Whether to save extracted relationships to JSON.

Type: bool

relationship_list

List of concept relationships.

Type: List[Tuple[str, str, str]]

concept_list

List of unique concepts extracted.

Type: List[str]

prompts

Dictionary of prompts for LLM tasks.

Type: Dict[str, str]

verbose

Whether to enable verbose logging.

Type: bool

G

Generated concept graph.

Type: Optional[nx.Graph]

user_objectives

User-defined lesson objectives.

Type: Dict[str, str]

load_and_process_lessons(threshold: float = 0.995): Loads lesson materials, summarizes content, and extracts relationships between concepts.

build_concept_map(directed: bool = False, concept_similarity_threshold: float = 0.995, dark_mode: bool = True, lesson_objectives: Optional[Dict[str, str]] = None): Runs the concept map generation pipeline and outputs visualizations.

build_concept_map(directed: bool = False, concept_similarity_threshold: float = 0.995, dark_mode: bool = True, lesson_objectives: Dict[str, str] | None = None) → None[source]

Execute the full pipeline to generate a concept map.

Parameters:

directed (bool, optional) – Whether to create a directed concept map. Defaults to False.
concept_similarity_threshold (float, optional) – Threshold for concept similarity. Defaults to 0.995.
dark_mode (bool, optional) – Whether to use dark mode for visualization. Defaults to True.
lesson_objectives (Optional[Dict[str, str]], optional) – User-provided lesson objectives. Defaults to None.

Raises:

ValueError – If any process encounters invalid data.
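
To illustrate what a similarity threshold such as concept_similarity_threshold could control, the following sketch merges concept labels whose similarity meets the threshold. The source does not state how similarity is measured, so comparing normalized strings with difflib.SequenceMatcher is an assumption made purely for illustration, and merge_similar_concepts is a hypothetical helper rather than part of the ConceptMapBuilder API.

```python
from difflib import SequenceMatcher
from typing import List


def merge_similar_concepts(concepts: List[str], threshold: float = 0.995) -> List[str]:
    """Keep the first occurrence of each concept and drop later near-duplicates."""
    kept: List[str] = []
    for concept in concepts:
        normalized = concept.strip().lower()
        # Assumed similarity measure: character-level ratio on normalized labels.
        is_duplicate = any(
            SequenceMatcher(None, normalized, existing.strip().lower()).ratio() >= threshold
            for existing in kept
        )
        if not is_duplicate:
            kept.append(concept)
    return kept


# Illustrative labels only.
labels = ["Separation of Powers", "separation of powers ", "separation of power", "Federalism"]
strict = merge_similar_concepts(labels)                  # default threshold of 0.995
relaxed = merge_similar_concepts(labels, threshold=0.9)  # merges looser variants too
```

Under this assumed measure, the default threshold merges only labels that are effectively identical after normalization, while a lower value also folds in close variants such as a singular and plural form.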

load_and_process_lessons(threshold: float = 0.995)[source]

Process lesson materials by summarizing content and extracting concept relationships.

Parameters:

threshold (float, optional) – Similarity threshold for extracted concepts. Defaults to 0.995.

For each lesson in lesson_range:

Load documents and objectives.
Summarize readings using the LLM.
Extract relationships between concepts and generate a list of unique concepts.
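
The per-lesson steps above can be sketched as a plain loop. This is a simplified illustration, not the actual method body: process_lessons and the callables load_lesson, summarize, and extract_relationships are hypothetical placeholders for the loader and LLM calls, and the stubs at the end exist only so the sketch runs without an LLM.

```python
from typing import Callable, Iterable, List, Tuple

Relationship = Tuple[str, str, str]  # (concept_a, relation, concept_b)


def process_lessons(
    lesson_range: Iterable[int],
    load_lesson: Callable[[int], Tuple[List[str], str]],
    summarize: Callable[[str, str], str],
    extract_relationships: Callable[[str, str], List[Relationship]],
) -> Tuple[List[Relationship], List[str]]:
    """Collect relationships across lessons and derive the unique concept list."""
    relationship_list: List[Relationship] = []
    for lesson_no in lesson_range:
        # Step 1: load documents and objectives for this lesson.
        documents, objectives = load_lesson(lesson_no)
        for document in documents:
            # Step 2: summarize each reading (an LLM call in the real module).
            summary = summarize(document, objectives)
            # Step 3: extract relationships from the summary.
            relationship_list.extend(extract_relationships(summary, objectives))
    # Unique concepts, preserving first-seen order.
    concept_list = list(dict.fromkeys(
        concept for a, _relation, b in relationship_list for concept in (a, b)
    ))
    return relationship_list, concept_list


# Stub callables (hypothetical) standing in for the loader and the LLM.
def fake_load(lesson_no: int) -> Tuple[List[str], str]:
    return [f"reading for lesson {lesson_no}"], f"objectives {lesson_no}"


def fake_summarize(document: str, objectives: str) -> str:
    return document.upper()


def fake_extract(summary: str, objectives: str) -> List[Relationship]:
    return [("shared concept", "appears in", summary)]


relationships, concepts = process_lessons(range(1, 3), fake_load, fake_summarize, fake_extract)
```

Passing the loader and LLM steps in as callables keeps the loop testable without network access; the real class presumably holds these as attributes (lesson_loader and llm) instead.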