# GraphRAG evaluation

> A benchmark comparison of GraphRAG and traditional RAG on multi-hop retrieval questions.

This evaluation compares GraphRAG with traditional RAG on the same benchmark corpus and the same question set.
The goal was to measure whether graph-aware retrieval improved multi-hop question answering.

## What was implemented

The evaluated system went beyond flat chunk retrieval.
It implemented GraphRAG in six stages:

1. Split source documents into chunks.
2. Extract entities and relationships from those chunks.
3. Build a knowledge graph from the extracted entities and relationships.
4. Run standard retrieval to get seed evidence for a user question.
5. Expand through graph links to collect connected evidence.
6. Rerank the final evidence set before answer generation.

That is the GraphRAG behavior that was evaluated.
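
Stages 4 and 5 can be pictured with a small sketch. It is a toy model, not the evaluated implementation: it assumes two chunks are linked whenever they mention the same entity, and the names `CHUNK_ENTITIES` and `expand_from_seeds` are hypothetical. The chunk ids and entity names come from the sample documents used later on this page, and `max_hops=2` mirrors the store setting shown there.

```python
from collections import deque

# Hypothetical toy index: chunk id -> entities mentioned in that chunk.
CHUNK_ENTITIES = {
    "doc_contract_award": {"Rivera Logistics", "Northern Bridge Sensors contract"},
    "doc_parent_company": {"Atlas Infrastructure Group", "Rivera Logistics"},
}


def expand_from_seeds(seed_chunks, chunk_entities, max_hops=2):
    """Breadth-first expansion; chunks sharing an entity are treated as linked."""
    # Invert the mapping so each entity points at the chunks that mention it.
    entity_chunks = {}
    for chunk_id, entities in chunk_entities.items():
        for entity in entities:
            entity_chunks.setdefault(entity, set()).add(chunk_id)

    hops = {chunk_id: 0 for chunk_id in seed_chunks}  # chunk id -> hop distance
    queue = deque(seed_chunks)
    while queue:
        current = queue.popleft()
        if hops[current] >= max_hops:
            continue  # do not expand past the hop budget
        for entity in chunk_entities.get(current, ()):
            for neighbor in entity_chunks[entity]:
                if neighbor not in hops:
                    hops[neighbor] = hops[current] + 1
                    queue.append(neighbor)
    return hops


evidence = expand_from_seeds(["doc_contract_award"], CHUNK_ENTITIES)
print(evidence)  # {'doc_contract_award': 0, 'doc_parent_company': 1}
```

Starting from the contract-award chunk, the shared entity `Rivera Logistics` pulls in the parent-company chunk at one hop, which flat retrieval might miss if that chunk is not semantically close to the question.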

## What it was compared against

The comparison used a traditional RAG baseline with the same corpus and the same answer model.

The only difference between the two runs was retrieval mode:

* **Baseline RAG**: flat chunk retrieval only
* **GraphRAG**: graph-seeded expansion plus graph-aware reranking

Holding the corpus and the answer model fixed isolates the effect of GraphRAG retrieval itself.

## Benchmark design

The benchmark was designed to test multi-hop retrieval rather than simple one-chunk lookup.

It used:

* three target knowledge graphs
* nine semantically similar distractor graphs
* 108 short factual documents
* 24 questions that required linking facts across multiple chunks

This design matters.
If every answer already appears in one obvious chunk, there is nothing for graph expansion to add, and GraphRAG will show little benefit over standard RAG.

## Evaluation method

Both retrieval modes were run over the same benchmark corpus and the same question set.
Both then used the same answer-generation model and the same answer prompt.

The evaluation recorded three metrics:

* **Exact match**: whether the final answer exactly matched the gold answer
* **Token F1**: token overlap between the final answer and the gold answer
* **Evidence recall\@5**: how much of the required supporting evidence appeared in the top 5 retrieved chunks
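
The three metrics can be sketched as follows. This is a minimal illustration, not the evaluation harness itself: the normalization (lowercasing and whitespace tokenization) and the function names are assumptions, since the page does not specify how answers were normalized before comparison.

```python
from collections import Counter


def normalize(text):
    """Lowercase and split on whitespace (an assumed normalization)."""
    return text.lower().split()


def exact_match(prediction, gold):
    """True when the normalized answers are identical."""
    return normalize(prediction) == normalize(gold)


def token_f1(prediction, gold):
    """Harmonic mean of token precision and recall against the gold answer."""
    pred_tokens, gold_tokens = normalize(prediction), normalize(gold)
    # Multiset intersection counts each shared token at most as often as it
    # appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


def evidence_recall_at_k(retrieved_ids, required_ids, k=5):
    """Fraction of required evidence ids found in the top-k retrieved ids."""
    required = set(required_ids)
    if not required:
        return 0.0
    return len(required & set(retrieved_ids[:k])) / len(required)


print(exact_match("unknown", "Javier Nanda"))  # False
print(round(token_f1("Atlas Group", "Atlas Infrastructure Group"), 2))  # 0.8
print(evidence_recall_at_k(["c1", "c2", "c3"], ["c1", "c9"]))  # 0.5
```

Exact match and token F1 score the final answer, while evidence recall scores retrieval alone, so a run can retrieve the right chunks and still miss on the answer metrics.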

## How the API was used

The evaluated API flow had four steps.
Both the baseline and GraphRAG runs used the same store and the same documents.
Only the query mode changed.

### 1. Create a GraphRAG store

<CodeGroup>
  ```bash CLI theme={null}
  mka1 search graphrag create-graph-RAG-store \
    --body '{
      "store_name": "benchmark_graphrag",
      "embedding_model": "auto",
      "extraction_model": "auto",
      "chunk_size": 800,
      "chunk_overlap": 120,
      "max_hops": 2
    }' \
    -H 'X-On-Behalf-Of: <end-user-id>'
  ```

  ```csharp C# SDK theme={null}
  using MeetKai.MKA1;
  using MeetKai.MKA1.Types.Components;

  var sdk = new SDK(
      bearerAuth: "Bearer <mka1-api-key>",
      serverUrl: "https://apigw.mka1.com"
  );

  var res = await sdk.Search.Graphrag.CreateGraphRAGStoreAsync(
      xApiKeyId: "<id>",
      xUserId: "<end-user-id>",
      xExchangeJwtExternalUserId: "<end-user-id>",
      body: new CreateGraphRAGStoreRequest()
      {
          StoreName = "benchmark_graphrag",
          EmbeddingModel = "auto",
          ExtractionModel = "auto",
          ChunkSize = 800,
          ChunkOverlap = 120,
          MaxHops = 2,
      }
  );
  ```

  ```python Python SDK theme={null}
  from mka1 import SDK

  sdk = SDK(bearer_auth="Bearer <mka1-api-key>")

  res = sdk.search.graphrag.create_graph_rag_store(
      store_name="benchmark_graphrag",
      embedding_model="auto",
      extraction_model="auto",
      chunk_size=800,
      chunk_overlap=120,
      max_hops=2,
  )
  ```

  ```bash bash theme={null}
  curl https://apigw.mka1.com/api/v1/search/graphrag/stores \
    --request POST \
    --header 'Content-Type: application/json' \
    --header 'Authorization: Bearer <mka1-api-key>' \
    --header 'X-On-Behalf-Of: <end-user-id>' \
    --data '{
      "store_name": "benchmark_graphrag",
      "embedding_model": "auto",
      "extraction_model": "auto",
      "chunk_size": 800,
      "chunk_overlap": 120,
      "max_hops": 2
    }'
  ```
</CodeGroup>

### 2. Ingest documents

<CodeGroup>
  ```bash CLI theme={null}
  mka1 search graphrag ingest-graph-RAG-documents \
    --store-name benchmark_graphrag \
    --body '{
      "documents": [
        {
          "document_id": "doc_contract_award",
          "text": "Rivera Logistics won the Northern Bridge Sensors contract.",
          "metadata": { "source": "benchmark" }
        },
        {
          "document_id": "doc_parent_company",
          "text": "Atlas Infrastructure Group owns Rivera Logistics.",
          "metadata": { "source": "benchmark" }
        }
      ]
    }'
  ```

  ```csharp C# SDK theme={null}
  using System.Collections.Generic;
  using MeetKai.MKA1;
  using MeetKai.MKA1.Types.Components;

  var sdk = new SDK(
      bearerAuth: "Bearer <mka1-api-key>",
      serverUrl: "https://apigw.mka1.com"
  );

  var res = await sdk.Search.Graphrag.IngestGraphRAGDocumentsAsync(
      new MeetKai.MKA1.Types.Requests.IngestGraphRAGDocumentsRequest
      {
          StoreName = "benchmark_graphrag",
          XApiKeyId = "<id>",
          XUserId = "<end-user-id>",
          XExchangeJwtExternalUserId = "<end-user-id>",
          Body = new IngestGraphRAGDocumentsRequest
          {
              Documents = new List<GraphRAGDocument>
              {
                  new GraphRAGDocument
                  {
                      DocumentId = "doc_contract_award",
                      Text = "Rivera Logistics won the Northern Bridge Sensors contract.",
                      Metadata = new Dictionary<string, object> { { "source", "benchmark" } },
                  },
                  new GraphRAGDocument
                  {
                      DocumentId = "doc_parent_company",
                      Text = "Atlas Infrastructure Group owns Rivera Logistics.",
                      Metadata = new Dictionary<string, object> { { "source", "benchmark" } },
                  },
              },
          },
      }
  );
  ```

  ```python Python SDK theme={null}
  from mka1 import SDK

  sdk = SDK(bearer_auth="Bearer <mka1-api-key>")

  res = sdk.search.graphrag.ingest_graph_rag_documents(
      store_name="benchmark_graphrag",
      documents=[
          {
              "document_id": "doc_contract_award",
              "text": "Rivera Logistics won the Northern Bridge Sensors contract.",
              "metadata": {"source": "benchmark"},
          },
          {
              "document_id": "doc_parent_company",
              "text": "Atlas Infrastructure Group owns Rivera Logistics.",
              "metadata": {"source": "benchmark"},
          },
      ],
  )
  ```

  ```bash bash theme={null}
  curl https://apigw.mka1.com/api/v1/search/graphrag/stores/benchmark_graphrag/documents \
    --request POST \
    --header 'Content-Type: application/json' \
    --header 'Authorization: Bearer <mka1-api-key>' \
    --header 'X-On-Behalf-Of: <end-user-id>' \
    --data '{
      "documents": [
        {
          "document_id": "doc_contract_award",
          "text": "Rivera Logistics won the Northern Bridge Sensors contract.",
          "metadata": {
            "source": "benchmark"
          }
        },
        {
          "document_id": "doc_parent_company",
          "text": "Atlas Infrastructure Group owns Rivera Logistics.",
          "metadata": {
            "source": "benchmark"
          }
        }
      ]
    }'
  ```
</CodeGroup>

### 3. Run the baseline RAG query

<CodeGroup>
  ```bash CLI theme={null}
  mka1 search graphrag query-graph-RAG-store \
    --store-name benchmark_graphrag \
    --body '{
      "query": "Who is the chief financial officer of the company that owns the Northern Bridge Sensors contract winner?",
      "mode": "baseline",
      "limit": 5,
      "seed_k": 8
    }'
  ```

  ```csharp C# SDK theme={null}
  using MeetKai.MKA1;
  using MeetKai.MKA1.Types.Components;

  var sdk = new SDK(
      bearerAuth: "Bearer <mka1-api-key>",
      serverUrl: "https://apigw.mka1.com"
  );

  var res = await sdk.Search.Graphrag.QueryGraphRAGStoreAsync(
      new MeetKai.MKA1.Types.Requests.QueryGraphRAGStoreRequest
      {
          StoreName = "benchmark_graphrag",
          XApiKeyId = "<id>",
          XUserId = "<end-user-id>",
          XExchangeJwtExternalUserId = "<end-user-id>",
          Body = new GraphRAGQueryRequest
          {
              Query = "Who is the chief financial officer of the company that owns the Northern Bridge Sensors contract winner?",
              Mode = GraphRAGQueryRequestMode.Baseline,
              Limit = 5,
              SeedK = 8,
          },
      }
  );
  ```

  ```python Python SDK theme={null}
  from mka1 import SDK

  sdk = SDK(bearer_auth="Bearer <mka1-api-key>")

  res = sdk.search.graphrag.query_graph_rag_store(
      store_name="benchmark_graphrag",
      query="Who is the chief financial officer of the company that owns the Northern Bridge Sensors contract winner?",
      mode="baseline",
      limit=5,
      seed_k=8,
  )
  ```

  ```bash bash theme={null}
  curl https://apigw.mka1.com/api/v1/search/graphrag/stores/benchmark_graphrag/query \
    --request POST \
    --header 'Content-Type: application/json' \
    --header 'Authorization: Bearer <mka1-api-key>' \
    --header 'X-On-Behalf-Of: <end-user-id>' \
    --data '{
      "query": "Who is the chief financial officer of the company that owns the Northern Bridge Sensors contract winner?",
      "mode": "baseline",
      "limit": 5,
      "seed_k": 8
    }'
  ```
</CodeGroup>

### 4. Run the GraphRAG query

<CodeGroup>
  ```bash CLI theme={null}
  mka1 search graphrag query-graph-RAG-store \
    --store-name benchmark_graphrag \
    --body '{
      "query": "Who is the chief financial officer of the company that owns the Northern Bridge Sensors contract winner?",
      "mode": "graph",
      "limit": 5,
      "seed_k": 8
    }'
  ```

  ```csharp C# SDK theme={null}
  using MeetKai.MKA1;
  using MeetKai.MKA1.Types.Components;

  var sdk = new SDK(
      bearerAuth: "Bearer <mka1-api-key>",
      serverUrl: "https://apigw.mka1.com"
  );

  var res = await sdk.Search.Graphrag.QueryGraphRAGStoreAsync(
      new MeetKai.MKA1.Types.Requests.QueryGraphRAGStoreRequest
      {
          StoreName = "benchmark_graphrag",
          XApiKeyId = "<id>",
          XUserId = "<end-user-id>",
          XExchangeJwtExternalUserId = "<end-user-id>",
          Body = new GraphRAGQueryRequest
          {
              Query = "Who is the chief financial officer of the company that owns the Northern Bridge Sensors contract winner?",
              Mode = GraphRAGQueryRequestMode.Graph,
              Limit = 5,
              SeedK = 8,
          },
      }
  );
  ```

  ```python Python SDK theme={null}
  from mka1 import SDK

  sdk = SDK(bearer_auth="Bearer <mka1-api-key>")

  res = sdk.search.graphrag.query_graph_rag_store(
      store_name="benchmark_graphrag",
      query="Who is the chief financial officer of the company that owns the Northern Bridge Sensors contract winner?",
      mode="graph",
      limit=5,
      seed_k=8,
  )
  ```

  ```bash bash theme={null}
  curl https://apigw.mka1.com/api/v1/search/graphrag/stores/benchmark_graphrag/query \
    --request POST \
    --header 'Content-Type: application/json' \
    --header 'Authorization: Bearer <mka1-api-key>' \
    --header 'X-On-Behalf-Of: <end-user-id>' \
    --data '{
      "query": "Who is the chief financial officer of the company that owns the Northern Bridge Sensors contract winner?",
      "mode": "graph",
      "limit": 5,
      "seed_k": 8
    }'
  ```
</CodeGroup>

Steps 3 and 4 together are the key comparison.
The query, corpus, and answer model stayed the same.
Only `mode` changed:

* `baseline` = traditional flat retrieval
* `graph` = graph-seeded expansion and graph-aware reranking
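
A comparison loop over a question set might look like the sketch below. The helper `compare_modes` and the stand-in `fake_query` are hypothetical and exist only so the example runs offline; the keyword arguments `query`, `mode`, `limit`, and `seed_k` follow the request bodies shown above.

```python
def compare_modes(query_fn, questions, modes=("baseline", "graph"), limit=5, seed_k=8):
    """Run every question once per mode, changing nothing except `mode`."""
    results = {mode: [] for mode in modes}
    for question in questions:
        for mode in modes:
            response = query_fn(query=question, mode=mode, limit=limit, seed_k=seed_k)
            results[mode].append(response)
    return results


def fake_query(query, mode, limit, seed_k):
    """Stand-in for the real query call; it simply echoes its arguments."""
    return {"query": query, "mode": mode, "limit": limit, "seed_k": seed_k}


runs = compare_modes(
    fake_query,
    ["Which company owns the company that won the Delta Reach Sensors contract?"],
)
print(runs["baseline"][0]["mode"], runs["graph"][0]["mode"])  # baseline graph
```

Assuming the Python SDK call shown in steps 3 and 4, `query_fn` could be a thin wrapper such as `lambda **kw: sdk.search.graphrag.query_graph_rag_store(store_name="benchmark_graphrag", **kw)`.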

## Measured results

The live benchmark run produced the following results:

| Method       | Exact Match | Token F1 | Evidence Recall\@5 |
| ------------ | ----------: | -------: | -----------------: |
| Baseline RAG |       25.0% |    27.1% |              55.9% |
| GraphRAG     |       62.5% |    62.5% |              71.9% |

Improvement of GraphRAG over baseline RAG, in percentage points:

* **Exact Match**: `+37.5` points
* **Token F1**: `+35.4` points
* **Evidence Recall\@5**: `+16.0` points
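
The improvements are plain differences between the two rows of the results table. The short check below reproduces that arithmetic; the dictionary keys and the `improvement` helper are illustrative names, and the numbers are copied from the table.

```python
# Percentages copied from the results table.
RESULTS = {
    "Baseline RAG": {"exact_match": 25.0, "token_f1": 27.1, "evidence_recall_at_5": 55.9},
    "GraphRAG": {"exact_match": 62.5, "token_f1": 62.5, "evidence_recall_at_5": 71.9},
}


def improvement(results, baseline="Baseline RAG", candidate="GraphRAG"):
    """Percentage-point difference per metric, rounded to one decimal place."""
    return {
        metric: round(results[candidate][metric] - results[baseline][metric], 1)
        for metric in results[baseline]
    }


deltas = improvement(RESULTS)
print(deltas)  # {'exact_match': 37.5, 'token_f1': 35.4, 'evidence_recall_at_5': 16.0}
```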

## Acceptance threshold

The benchmark used the following pass criteria:

* exact match improvement of at least `+5.0` points
* evidence recall\@5 improvement of at least `+10.0` points

The evaluated GraphRAG implementation passed both thresholds.
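
The pass check amounts to comparing each measured improvement against its minimum. A minimal sketch, with hypothetical names and the figures taken from this page:

```python
# Minimum required improvements, in percentage points.
THRESHOLDS = {"exact_match": 5.0, "evidence_recall_at_5": 10.0}

# Measured improvements of GraphRAG over baseline RAG, in percentage points.
MEASURED_DELTAS = {"exact_match": 37.5, "token_f1": 35.4, "evidence_recall_at_5": 16.0}


def passes(deltas, thresholds):
    """True only if every thresholded metric meets or exceeds its minimum."""
    return all(deltas[metric] >= minimum for metric, minimum in thresholds.items())


print(passes(MEASURED_DELTAS, THRESHOLDS))  # True
```

Token F1 has no threshold of its own, so it is reported but does not affect the pass decision.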

## Representative question-level outcomes

Examples where GraphRAG succeeded and baseline RAG did not:

* "Who is the chief financial officer of the company that owns the Northern Bridge Sensors contract winner?"
  * Baseline RAG: `unknown`
  * GraphRAG: `Javier Nanda`
* "Which company acquired the firm that prepared a risk report for Meridian Ports Authority?"
  * Baseline RAG: `unknown`
  * GraphRAG: `Atlas Infrastructure Group`
* "Which company owns the company that won the Delta Reach Sensors contract?"
  * Baseline RAG: `unknown`
  * GraphRAG: `Bluepeak Transit Group`

These are multi-hop questions.
They require linking facts across connected entities rather than retrieving a single directly matching chunk.
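
To make the two-hop structure concrete, the sketch below answers "which company owns the winner of a contract" over the two sample documents ingested in step 2. The triple representation, the relation names `won` and `owns`, and the helper functions are assumptions made for illustration; the page does not describe the extracted graph schema.

```python
# Facts restated from the two sample documents ingested in step 2.
TRIPLES = [
    ("Rivera Logistics", "won", "Northern Bridge Sensors contract"),
    ("Atlas Infrastructure Group", "owns", "Rivera Logistics"),
]


def subjects_of(relation, obj, triples):
    """Return every subject linked to `obj` by `relation`."""
    return [s for s, r, o in triples if r == relation and o == obj]


def owner_of_contract_winner(contract, triples):
    """Hop 1: contract -> winner. Hop 2: winner -> owner."""
    owners = []
    for winner in subjects_of("won", contract, triples):
        owners.extend(subjects_of("owns", winner, triples))
    return owners


print(owner_of_contract_winner("Northern Bridge Sensors contract", TRIPLES))
# ['Atlas Infrastructure Group']
```

Neither fact alone answers the question: the first names the winner and the second names the owner, so both pieces of evidence must be retrieved.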

## Why GraphRAG performed better

Baseline RAG retrieved semantically similar chunks, but it sometimes failed to retrieve the connected evidence needed to complete the reasoning chain.

GraphRAG improved performance by:

* identifying the relevant seed entities from the question
* traversing graph relationships to find linked evidence
* reranking the final evidence set with graph signals in addition to semantic similarity

That is why the improvement appears most clearly on multi-hop questions.
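
One way to picture reranking with graph signals is a weighted blend of semantic similarity and graph proximity. The page does not document the actual scoring formula, so everything below is an assumption: the `rerank` helper, the `graph_weight` value, the hop-based signal, and the sample candidates are all hypothetical.

```python
def rerank(candidates, graph_weight=0.3, limit=5):
    """Blend semantic similarity with a hop-distance signal (illustrative weights)."""

    def score(candidate):
        # Chunks closer to a seed in the graph get a stronger graph signal.
        graph_signal = 1.0 / (1 + candidate["hops"])
        return (1 - graph_weight) * candidate["similarity"] + graph_weight * graph_signal

    return sorted(candidates, key=score, reverse=True)[:limit]


# Hypothetical candidates: a similar-looking distractor far from the seed,
# and a less similar chunk one hop from the seed.
CANDIDATES = [
    {"chunk_id": "chunk_distractor", "similarity": 0.80, "hops": 4},
    {"chunk_id": "chunk_linked", "similarity": 0.72, "hops": 1},
]

ranked = rerank(CANDIDATES)
print([c["chunk_id"] for c in ranked])  # ['chunk_linked', 'chunk_distractor']
```

With `graph_weight=0` the ordering falls back to pure similarity and the distractor ranks first, which is the failure mode described for baseline RAG.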

## Summary

On this benchmark, GraphRAG outperformed traditional RAG on both final-answer accuracy and supporting-evidence retrieval.
The strongest gains appeared on questions that required linking facts across multiple connected entities.