class_factory.concept_web package
class_factory.concept_web.ConceptWeb module
ConceptWeb Module
The ConceptWeb module provides tools to automatically extract, analyze, and visualize key concepts from lesson materials, helping to identify connections across topics and lessons. Central to this module is the ConceptMapBuilder class, which leverages a language model (LLM) to identify and structure important ideas and relationships from lesson readings and objectives into a graph-based representation.
Key functionalities of the module include:
- Concept Extraction:
Identifies key concepts from lesson readings and objectives using an LLM.
Summarizes and highlights main themes from each lesson’s content.
- Relationship Mapping:
Extracts and maps relationships between identified concepts based on lesson objectives and content.
Facilitates understanding of how topics interrelate within and across lessons.
- Graph-Based Visualization:
Constructs a concept map in which nodes represent concepts and edges represent relationships.
Generates both interactive graph-based visualizations (HTML) and word clouds for key concepts.
- Community Detection:
Groups closely related concepts into thematic clusters.
Helps identify broader themes or subtopics within the lesson materials.
- Data Saving:
Optionally saves intermediate data (concepts and relationships) as JSON files for further review or analysis.
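The graph and clustering steps listed above can be sketched with networkx, which this module lists as a dependency. This is a minimal illustration rather than the module's actual implementation: the sample triples, the build_graph helper, the "relation" edge attribute, and the choice of greedy modularity maximization for community detection are all assumptions.

```python
from typing import List, Tuple

import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

# Hypothetical (concept, relation, concept) triples; real ones would come from the LLM.
relationships: List[Tuple[str, str, str]] = [
    ("democracy", "requires", "elections"),
    ("elections", "depend on", "voting rights"),
    ("democracy", "protects", "voting rights"),
    ("markets", "set", "prices"),
    ("prices", "signal", "scarcity"),
    ("markets", "respond to", "scarcity"),
]


def build_graph(triples: List[Tuple[str, str, str]]) -> nx.Graph:
    """Build an undirected graph: nodes are concepts, edges carry the relation label."""
    graph = nx.Graph()
    for source, relation, target in triples:
        graph.add_edge(source, target, relation=relation)
    return graph


graph = build_graph(relationships)

# Group closely connected concepts into clusters (one of several possible algorithms).
communities = [set(members) for members in greedy_modularity_communities(graph)]
for index, members in enumerate(communities):
    print(index, sorted(members))
```

With this sample data the two disconnected triangles end up as two separate clusters, which is the kind of thematic grouping the community detection step is meant to surface.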
Dependencies
This module depends on:
langchain_core: For LLM-based extraction and summarization tasks.
networkx: For graph generation and analysis of concept relationships.
matplotlib or plotly: For creating visualizations and word clouds.
Custom utilities for loading documents, extracting objectives, and handling logging.
Usage Overview
Initialize ConceptMapBuilder: Create a LessonLoader with the syllabus file, reading directory, and project directory, then pass it to ConceptMapBuilder together with an LLM, the course name, and the lesson number.
Generate the Concept Map: Call build_concept_map() to process lesson materials, extract and summarize concepts, map relationships, and generate visualizations.
Save and Review: The generated concept map can be saved as an interactive HTML file or as a static word cloud for easier review and analysis.
Example
from pathlib import Path

from langchain_openai import ChatOpenAI

# ConceptMapBuilder is defined in the ConceptWeb module
from class_factory.concept_web.ConceptWeb import ConceptMapBuilder
from class_factory.utils.load_documents import LessonLoader

# Set up paths and initialize components
syllabus_path = Path("/path/to/syllabus.docx")
reading_dir = Path("/path/to/lesson/readings")
project_dir = Path("/path/to/project")
llm = ChatOpenAI(api_key="your_api_key")

# Initialize the lesson loader and concept map builder
lesson_loader = LessonLoader(syllabus_path=syllabus_path, reading_dir=reading_dir, project_dir=project_dir)
concept_map_builder = ConceptMapBuilder(
    lesson_no=1,
    lesson_loader=lesson_loader,
    llm=llm,
    course_name="Sample Course",
    lesson_range=range(1, 5),
)

# Build and visualize the concept map
concept_map_builder.build_concept_map()
- class class_factory.concept_web.ConceptWeb.ConceptMapBuilder(lesson_no: int, lesson_loader: LessonLoader, llm, course_name: str, output_dir: str | Path = None, lesson_range: range | int = None, lesson_objectives: List[str] | Dict[str, str] = None, verbose: bool = False, save_relationships: bool = False, **kwargs)
Bases:
BaseModel
Generate concept maps (a form of knowledge graph) from lesson materials, using a language model (LLM) to summarize content, extract relationships, and visualize concepts in a structured graph format.
This class provides end-to-end functionality for concept map creation, including loading readings, summarizing content, extracting concept relationships, constructing graphs, and generating interactive and visual outputs like word clouds.
- lesson_no
Current lesson number being processed.
- Type:
int
- lesson_loader
Loader instance for handling lesson materials.
- Type:
LessonLoader
- llm
Language model instance for summarization and relationship extraction.
- Type:
Any
- course_name
Course name, used as context in LLM prompts.
- Type:
str
- output_dir
Directory for saving generated outputs.
- Type:
Path
- lesson_range
Range of lessons to process.
- Type:
range
- save_relationships
Whether to save extracted relationships to JSON.
- Type:
bool
- relationship_list
List of concept relationships.
- Type:
List[Tuple[str, str, str]]
- concept_list
List of unique concepts extracted.
- Type:
List[str]
- prompts
Dictionary of prompts for LLM tasks.
- Type:
Dict[str, str]
- verbose
Whether to enable verbose logging.
- Type:
bool
- G
Generated concept graph.
- Type:
Optional[nx.Graph]
- user_objectives
User-defined lesson objectives.
- Type:
Dict[str, str]
- load_and_process_lessons(threshold: float = 0.995)
Loads lesson materials, summarizes content, and extracts relationships between concepts.
- build_concept_map(directed: bool = False, concept_similarity_threshold: float = 0.995, dark_mode: bool = True, lesson_objectives: Optional[Dict[str, str]] = None)
Runs the concept map generation pipeline and outputs visualizations.
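The relationship_list attribute is typed as a list of three-string tuples, and save_relationships controls whether relationships are written to JSON. A minimal sketch of such a round trip, using only the standard library, might look like the following. The function names, the file layout, and the (concept, relation, concept) ordering of each tuple are assumptions for illustration, not the class's actual behavior.

```python
import json
import tempfile
from pathlib import Path
from typing import List, Tuple

Triple = Tuple[str, str, str]


def save_relationships_json(relationships: List[Triple], path: Path) -> None:
    """Write relationships as indented JSON; each tuple becomes a JSON array."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(relationships, indent=2), encoding="utf-8")


def load_relationships_json(path: Path) -> List[Triple]:
    """Read relationships back, converting JSON arrays to tuples again."""
    raw = json.loads(path.read_text(encoding="utf-8"))
    return [tuple(item) for item in raw]


# Hypothetical sample data in the documented List[Tuple[str, str, str]] shape.
relationship_list: List[Triple] = [
    ("supply", "influences", "price"),
    ("demand", "influences", "price"),
]

with tempfile.TemporaryDirectory() as tmp_dir:
    json_path = Path(tmp_dir) / "lesson_1" / "relationships.json"
    save_relationships_json(relationship_list, json_path)
    restored = load_relationships_json(json_path)
```

JSON has no tuple type, so the explicit conversion on load is what keeps the restored data in the same shape as the in-memory list.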
- build_concept_map(directed: bool = False, concept_similarity_threshold: float = 0.995, dark_mode: bool = True, lesson_objectives: Dict[str, str] | None = None) -> None
Execute the full pipeline to generate a concept map.
- Parameters:
directed (bool, optional) – Whether to create a directed concept map. Defaults to False.
concept_similarity_threshold (float, optional) – Threshold for concept similarity. Defaults to 0.995.
dark_mode (bool, optional) – Whether to use dark mode for visualization. Defaults to True.
lesson_objectives (Optional[Dict[str, str]], optional) – User-provided lesson objectives. Defaults to None.
- Raises:
ValueError – If any process encounters invalid data.
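As a hedged usage sketch, the wrapper below calls build_concept_map with non-default options and handles the documented ValueError. The wrapper name try_build_concept_map and the logging behavior are hypothetical; only the keyword arguments come from the signature above.

```python
import logging
from typing import Any, Dict, Optional

logger = logging.getLogger(__name__)


def try_build_concept_map(
    builder: Any,
    lesson_objectives: Optional[Dict[str, str]] = None,
) -> bool:
    """Call build_concept_map with non-default options; return False on ValueError."""
    try:
        builder.build_concept_map(
            directed=True,  # directed edges instead of the default undirected map
            concept_similarity_threshold=0.995,  # documented default, shown explicitly
            dark_mode=False,  # light theme for the visualization
            lesson_objectives=lesson_objectives,
        )
    except ValueError as error:
        logger.error("Concept map generation failed: %s", error)
        return False
    return True
```

A call might pass objectives such as {"Lesson 1": "Define supply and demand"}. The key format shown here is an assumption, since the reference above only states that the argument is a Dict[str, str].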
- load_and_process_lessons(threshold: float = 0.995)
Process lesson materials by summarizing content and extracting concept relationships.
- Parameters:
threshold (float, optional) – Similarity threshold for extracted concepts. Defaults to 0.995.
- For each lesson in lesson_range:
Load documents and objectives.
Summarize readings using the LLM.
Extract relationships between concepts and generate a unique concept list.
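The per-lesson loop described above can be sketched as follows. The three callables stand in for the document loader and the LLM calls, and their names and signatures are hypothetical. Threshold-based merging of similar concepts is left out because this reference does not describe how similarity is computed.

```python
from typing import Callable, Iterable, List, Tuple

Triple = Tuple[str, str, str]


def process_lessons(
    lesson_range: Iterable[int],
    load_lesson: Callable[[int], Tuple[List[str], str]],
    summarize: Callable[[str, str], str],
    extract_relationships: Callable[[str, str], List[Triple]],
) -> Tuple[List[Triple], List[str]]:
    """Run load -> summarize -> extract for each lesson and collect the results."""
    relationships: List[Triple] = []
    for lesson_no in lesson_range:
        # Step 1: load the readings and objectives for this lesson.
        readings, objectives = load_lesson(lesson_no)
        for reading in readings:
            # Step 2: summarize each reading with the objectives as context.
            summary = summarize(reading, objectives)
            # Step 3: extract (concept, relation, concept) triples from the summary.
            relationships.extend(extract_relationships(summary, objectives))
    # Build the unique concept list, keeping first-seen order.
    concepts = list(
        dict.fromkeys(
            concept
            for source, _relation, target in relationships
            for concept in (source, target)
        )
    )
    return relationships, concepts
```

Passing the loader and LLM steps in as callables keeps the sketch testable without network access; a real run would supply functions backed by LessonLoader and the configured LLM.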