graphdoc.prompts.schema_doc_quality module
- class graphdoc.prompts.schema_doc_quality.DocQualitySignature(*, database_schema: str, category: Literal['perfect', 'almost perfect', 'poor but correct', 'incorrect'], rating: Literal[4, 3, 2, 1])[source]
Bases: Signature
You are a documentation quality evaluator specializing in GraphQL schemas. Your task is to assess the quality of documentation provided for a given database schema. Carefully analyze the schema's descriptions for clarity, accuracy, and completeness. Categorize the documentation into one of the following ratings based on your evaluation:
- perfect (4): The documentation is comprehensive and leaves no room for ambiguity in understanding the schema and its database content.
- almost perfect (3): The documentation is clear and mostly free of ambiguity, but there is potential for further improvement.
- poor but correct (2): The documentation is correct but lacks detail, resulting in some ambiguity. It requires enhancement to be more informative.
- incorrect (1): The documentation contains errors or misleading information, regardless of any correct segments present. Such inaccuracies necessitate an incorrect rating.
Provide a step-by-step reasoning to support your evaluation, along with the appropriate category label and numerical rating.
- database_schema: str
- category: Literal['perfect', 'almost perfect', 'poor but correct', 'incorrect']
- rating: Literal[4, 3, 2, 1]
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model; should be a dictionary conforming to pydantic's ConfigDict.
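A minimal usage sketch (assuming a dspy language model is already configured; the model name below is a placeholder, not part of this module):

    import dspy
    from graphdoc.prompts.schema_doc_quality import DocQualitySignature

    # Placeholder LM identifier; any dspy-supported model works here.
    dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))

    # DocQualitySignature behaves like any dspy.Signature: wrap it in a predictor
    # and pass the schema text as the database_schema input field.
    predictor = dspy.Predict(DocQualitySignature)
    result = predictor(database_schema='type Token @entity { "The token id" id: ID! }')
    print(result.category, result.rating)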
- class graphdoc.prompts.schema_doc_quality.DocQualityDemonstrationSignature(*, database_schema: str, category: Literal['perfect', 'almost perfect', 'poor but correct', 'incorrect'], rating: Literal[4, 3, 2, 1])[source]
Bases: Signature
You are evaluating the output of an LLM program; expect hallucinations. Given a GraphQL schema, evaluate the quality of documentation for that schema and provide a category rating.
The categories are described as:
- perfect (4): The documentation contains enough information so that the interpretation of the schema and its database content is completely free of ambiguity.
  perfect (4) example:
      type Domain @entity {
          "The namehash (id) of the parent name. References the Domain entity that is the parent of the current domain. Type: Domain"
          parent: Domain
      }
- almost perfect (3): The documentation is almost perfect and free from ambiguity, but there is room for improvement.
  almost perfect (3) example:
      type Token @entity {
          "Name of the token, mirrored from the smart contract"
          name: String!
      }
- poor but correct (2): The documentation is poor but correct and has room for improvement due to missing information. The documentation is not incorrect.
  poor but correct (2) example:
      type InterestRate @entity {
          "Description for column: id"
          id: ID!
      }
- incorrect (1): The documentation is incorrect and contains inaccurate or misleading information. Any incorrect information automatically leads to an incorrect rating, even if some correct information is present.
  incorrect (1) example:
      type BridgeProtocol implements Protocol @entity {
          "Social Security Number of the protocol's main developer"
          id: Bytes!
      }
Output a number rating that corresponds to the categories described above.
- database_schema: str
- category: Literal['perfect', 'almost perfect', 'poor but correct', 'incorrect']
- rating: Literal[4, 3, 2, 1]
- model_config: ClassVar[ConfigDict] = {}
Configuration for the model; should be a dictionary conforming to pydantic's ConfigDict.
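A hedged sketch pairing this signature with chain-of-thought prediction (assumes a dspy LM has been configured as in the previous example):

    import dspy
    from graphdoc.prompts.schema_doc_quality import DocQualityDemonstrationSignature

    # ChainOfThought adds an intermediate reasoning step before the
    # category/rating outputs are produced.
    rater = dspy.ChainOfThought(DocQualityDemonstrationSignature)
    schema = 'type InterestRate @entity { "Description for column: id" id: ID! }'
    result = rater(database_schema=schema)
    print(result.category, result.rating)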
- graphdoc.prompts.schema_doc_quality.doc_quality_factory(key: str | Signature | SignatureMeta) → Signature | SignatureMeta[source]
Factory function to return the correct signature based on the key. Currently only supports two signatures (doc_quality and doc_quality_demo).
- Parameters:
key (Union[str, dspy.Signature]) – The key to return the signature for.
- Returns:
The signature for the given key.
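A brief sketch of the factory, using the two string keys named above (the comments reflect the documented mapping):

    from graphdoc.prompts.schema_doc_quality import doc_quality_factory

    quality_sig = doc_quality_factory("doc_quality")        # DocQualitySignature
    demo_sig = doc_quality_factory("doc_quality_demo")      # DocQualityDemonstrationSignature

    # Per the type hints, an existing dspy.Signature (or SignatureMeta) can also
    # be passed directly as the key.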
- class graphdoc.prompts.schema_doc_quality.DocQualityPrompt(prompt: Literal['doc_quality', 'doc_quality_demo'] | Signature | SignatureMeta = 'doc_quality', prompt_type: Literal['predict', 'chain_of_thought'] | Callable = 'predict', prompt_metric: Literal['rating', 'category'] | Callable = 'rating')[source]
Bases: SinglePrompt
DocQualityPrompt class for evaluating documentation quality.
A single prompt that can be used to evaluate the quality of the documentation for a given schema. It wraps the SinglePrompt class and implements its abstract methods.
- __init__(prompt: Literal['doc_quality', 'doc_quality_demo'] | Signature | SignatureMeta = 'doc_quality', prompt_type: Literal['predict', 'chain_of_thought'] | Callable = 'predict', prompt_metric: Literal['rating', 'category'] | Callable = 'rating') → None[source]
Initialize the DocQualityPrompt.
- Parameters:
prompt (Union[str, dspy.Signature]) – The prompt to use. Can either be a string that maps to a defined signature, as set in the doc_quality_factory, or a dspy.Signature.
prompt_type (Union[Literal["predict", "chain_of_thought"], Callable]) – The type of prompt to use.
prompt_metric (Union[Literal["rating", "category"], Callable]) – The metric to use. Can either be a string that maps to a defined metric, as set in the doc_quality_factory, or a custom callable function. Function must have the signature (example: dspy.Example, prediction: dspy.Prediction) -> bool.
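A hedged construction sketch covering the three parameters (string values come from the Literal options above; the custom metric function is a hypothetical illustration):

    import dspy
    from graphdoc.prompts.schema_doc_quality import DocQualityPrompt

    # Defaults: the doc_quality signature, a predict-style prompt, scored on rating.
    prompt = DocQualityPrompt()

    # Demonstration signature with chain-of-thought, scored on the category label.
    cot_prompt = DocQualityPrompt(
        prompt="doc_quality_demo",
        prompt_type="chain_of_thought",
        prompt_metric="category",
    )

    # A custom metric must match (example: dspy.Example, prediction: dspy.Prediction) -> bool.
    def exact_rating(example: dspy.Example, prediction: dspy.Prediction) -> bool:
        return int(prediction.rating) == int(example.rating)

    custom_prompt = DocQualityPrompt(prompt_metric=exact_rating)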
- evaluate_metric(example: Example, prediction: Prediction, trace=None) → bool[source]
Evaluate the metric for the given example and prediction.
- Parameters:
example (dspy.Example) – The example to evaluate the metric on.
prediction (dspy.Prediction) – The prediction to evaluate the metric on.
trace (Any) – Used for DSPy.
- Returns:
The result of the evaluation: a boolean indicating whether the prediction is correct under the configured metric.
- Return type:
bool
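A small sketch of the metric check (field values are illustrative; the example/prediction field names follow the signature classes above):

    import dspy
    from graphdoc.prompts.schema_doc_quality import DocQualityPrompt

    prompt = DocQualityPrompt(prompt_metric="rating")

    example = dspy.Example(
        database_schema='type Token @entity { "Name of the token" name: String! }',
        category="almost perfect",
        rating=3,
    ).with_inputs("database_schema")

    prediction = dspy.Prediction(category="almost perfect", rating=3)

    # True when the prediction agrees with the example on the configured metric.
    print(prompt.evaluate_metric(example, prediction))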
- format_metric(examples: List[Example], overall_score: float, results: List, scores: List) → Dict[str, Any][source]
Formats evaluation metrics into a structured report containing:
- Overall score across all categories
- Percentage correct per category
- Detailed results for each evaluation
- Parameters:
examples (List[dspy.Example]) – The examples to evaluate the metric on.
overall_score (float) – The overall score across all categories.
results (List) – The results of the evaluation.
scores (List) – The scores of the evaluation.
- Returns:
A dictionary containing the overall score, per-category scores, and details:
{"overall_score": 0, "per_category_scores": {}, "details": [], "results": []}
- Return type:
Dict[str, Any]