AI Document Processing Guide | Nagare Documentation

01 How It Works

Nagare uses AI (powered by Vertesia) to automatically extract structured data from unstructured documents like GP capital statements, fund presentations, and quarterly reports.

Document Processing Flow

Upload Document

Get a signed upload URL and upload your PDF/Excel file

AI Processing

AI reads the document and extracts tables, numbers, dates

Structured Data

Returns JSON with fund parameters, transactions, etc.

Review & Import

Review extracted data and import into your portfolio

What Can Be Extracted?

From Capital Statements

• Capital calls
• Distributions
• NAV / Fair value
• Dates and quarters
• Currency

From Fund Presentations

• Fund name and vintage
• Total commitment
• Investment strategy
• Expected returns (TVPI/IRR)
• Fund life and deployment period

Accuracy Rate

AI extraction is typically 95-99% accurate for well-formatted documents. Always review extracted data before importing: it takes about 30 seconds per fund and catches edge cases.

02 Uploading Documents

Step 1: Get Upload URL

POST /api/v1/documents/upload-url

curl -X POST https://api.nagarehq.com/api/v1/documents/upload-url \
  -H "Authorization: Bearer fnd_live_YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Q3-2024-Capital-Statement.pdf",
    "mime_type": "application/pdf"
  }'

The response includes a signed upload URL (valid for 1 hour):

Response: 200 OK

{
  "url": "https://storage.googleapis.com/...",
  "id": "obj_abc123def456",
  "mime_type": "application/pdf",
  "path": "uploads/Q3-2024-Capital-Statement.pdf"
}
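
The mime_type in the request should describe the file you are about to upload. A minimal sketch of choosing it from the file extension, assuming the standard registered types for PDF and Excel files. The helper name guess_mime_type is hypothetical, and this guide only confirms application/pdf, so treat the Excel types as an assumption about what the API accepts:

```bash
#!/bin/bash
# Hypothetical helper: map a file name to a MIME type by its extension.
# The PDF and Excel values are the standard registered types; whether the
# API accepts each of them is an assumption, not confirmed by this guide.
guess_mime_type() {
  local name lower
  name=$1
  # Lowercase the name so "REPORT.PDF" is handled as well.
  lower=$(printf '%s' "$name" | tr '[:upper:]' '[:lower:]')
  case "$lower" in
    *.pdf)  echo "application/pdf" ;;
    *.xlsx) echo "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet" ;;
    *.xls)  echo "application/vnd.ms-excel" ;;
    *)      echo "unsupported file type: $name" >&2; return 1 ;;
  esac
}
```

The result can be passed as the mime_type field here and reused as the Content-Type header in Step 2, so the two stay in sync.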

Step 2: Upload File to Signed URL

Upload using curl or fetch()

curl -X PUT "https://storage.googleapis.com/..." \
  -H "Content-Type: application/pdf" \
  --data-binary @Q3-2024-Capital-Statement.pdf

Step 3: Create Document Object

After uploading the file, create a document object to trigger AI processing:

POST /api/v1/documents

curl -X POST https://api.nagarehq.com/api/v1/documents \
  -H "Authorization: Bearer fnd_live_YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Q3 2024 Capital Statement",
    "description": "Sequoia Fund XVIII - Q3 2024",
    "sourceId": "obj_abc123def456",
    "fundId": "SEQUOIA-XVIII",
    "fundName": "Sequoia Capital Fund XVIII",
    "tags": ["capital-statement", "2024-Q3"],
    "properties": {
      "quarter": "Q3",
      "year": 2024
    }
  }'

Response: Document Created with Workflow Info

{
  "id": "doc_xyz789",
  "name": "Q3 2024 Capital Statement",
  "status": "processing",
  "workflow_id": "wf_parse_document",
  "workflow_run_id": "run_abc123",
  "createdAt": "2024-03-16T10:30:00Z"
}
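
Document names and descriptions often contain quotes or backslashes, which break a hand-written JSON body. A minimal sketch of building the request body safely in plain bash. The helper names json_escape and build_document_payload are hypothetical, only the name, sourceId, and fundId fields from the request above are included, and only the most common special characters are escaped:

```bash
#!/bin/bash
# Hypothetical helper: escape a string for use inside a JSON string literal.
# Covers backslash, double quote, newline, carriage return, and tab only;
# other control characters are not handled in this sketch.
json_escape() {
  local s=$1
  s=${s//\\/\\\\}      # backslash first, so later escapes are not doubled
  s=${s//\"/\\\"}      # double quote
  s=${s//$'\n'/\\n}    # newline
  s=${s//$'\r'/\\r}    # carriage return
  s=${s//$'\t'/\\t}    # tab
  printf '%s' "$s"
}

# Hypothetical helper: build the create-document body from
# name ($1), sourceId ($2), and fundId ($3).
build_document_payload() {
  printf '{"name":"%s","sourceId":"%s","fundId":"%s"}' \
    "$(json_escape "$1")" "$(json_escape "$2")" "$(json_escape "$3")"
}
```

The output can be passed to curl with -d "$(build_document_payload "$NAME" "$FILE_ID" "$FUND_ID")". If jq is installed, jq -n with --arg is a more robust way to produce the same body.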

03 Tracking Processing Status

AI processing typically takes 10-60 seconds depending on document complexity. Track progress using the workflow status endpoint:

GET /api/v1/documents/:id/workflow-status

curl -X GET https://api.nagarehq.com/api/v1/documents/doc_xyz789/workflow-status \
  -H "Authorization: Bearer fnd_live_YOUR_API_KEY"

Response: Workflow Status

{
  "documentId": "doc_xyz789",
  "status": "completed",
  "workflows": [
    {
      "workflow_id": "wf_parse_document",
      "run_id": "run_abc123",
      "status": "completed",
      "started_at": "2024-03-16T10:30:05Z",
      "completed_at": "2024-03-16T10:30:42Z"
    }
  ]
}

Status Values

pending: Document created, waiting to process

processing: AI is reading the document

completed: Data extracted successfully

failed: Processing error (check logs)
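
When polling, only two of these values mean you can stop. A minimal sketch of that check, assuming the four status values listed above are the complete set; the function name is_terminal_status is hypothetical:

```bash
#!/bin/bash
# Hypothetical helper: decide whether polling can stop for a given status.
# Returns 0 for a terminal status (completed or failed), 1 for a status
# that is still in progress, and 2 for a value not listed in this guide.
is_terminal_status() {
  case "$1" in
    completed|failed)   return 0 ;;
    pending|processing) return 1 ;;
    *) echo "unknown status: $1" >&2; return 2 ;;
  esac
}
```

Treating unknown values separately avoids polling forever if the API ever returns a status this guide does not describe.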

04 Analyzing Fund Documents

For fund presentations or pitchbooks, use the analyze endpoint to extract structured fund parameters:

POST /api/v1/documents/analyze-fund

curl -X POST https://api.nagarehq.com/api/v1/documents/analyze-fund \
  -H "Authorization: Bearer fnd_live_YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "documentId": "doc_xyz789"
  }'

Returns structured fund data ready to import:

Response: Extracted Fund Data

{
  "success": true,
  "data": {
    "fund_name": "Sequoia Capital Fund XVIII",
    "fund_type": "VENTURE_CAPITAL",
    "vintage_year": 2024,
    "total_commitment": 1200000000,
    "currency": "USD",
    "investment_period_years": 5,
    "fund_life_years": 10,
    "expected_tvpi": 2.5,
    "expected_irr": 0.20,
    "management_fee": 0.02,
    "carried_interest": 0.20
  }
}

Use Case: Pre-Fill Fund Forms

The extracted data can be used to pre-fill the fund creation form in the UI, saving users from manually typing all parameters.

05 Searching Documents

List Documents

GET /api/v1/documents

# List all documents for current tenant
curl -X GET "https://api.nagarehq.com/api/v1/documents?limit=50" \
  -H "Authorization: Bearer fnd_live_YOUR_API_KEY"

# Filter by fund
curl -X GET "https://api.nagarehq.com/api/v1/documents?fundId=SEQUOIA-XVIII" \
  -H "Authorization: Bearer fnd_live_YOUR_API_KEY"

Advanced Search

Use the search endpoint for full-text search across document content:

POST /api/v1/documents/search

curl -X POST https://api.nagarehq.com/api/v1/documents/search \
  -H "Authorization: Bearer fnd_live_YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "capital call Q3 2024",
    "fundId": "SEQUOIA-XVIII",
    "status": "completed",
    "limit": 20
  }'

Delete Document

DELETE /api/v1/documents/:id

curl -X DELETE https://api.nagarehq.com/api/v1/documents/doc_xyz789 \
  -H "Authorization: Bearer fnd_live_YOUR_API_KEY"

06 Complete Workflow Example

Here's a complete example of uploading and processing a capital statement:


#!/bin/bash
API_KEY="fnd_live_YOUR_API_KEY"
API_URL="https://api.nagarehq.com/api/v1"

# Step 1: Get upload URL
UPLOAD_RESPONSE=$(curl -s -X POST "$API_URL/documents/upload-url" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"name":"capital-statement.pdf","mime_type":"application/pdf"}')

UPLOAD_URL=$(echo "$UPLOAD_RESPONSE" | jq -r '.url')
FILE_ID=$(echo "$UPLOAD_RESPONSE" | jq -r '.id')

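# Step 2: Upload file (--fail makes curl exit non-zero on HTTP errors such as an expired URL)
curl -sS --fail -X PUT "$UPLOAD_URL" \
  -H "Content-Type: application/pdf" \
  --data-binary @capital-statement.pdf || { echo "Upload failed" >&2; exit 1; }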

# Step 3: Create document
DOC_RESPONSE=$(curl -s -X POST "$API_URL/documents" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d "{
    \"name\":\"Q3 2024 Capital Statement\",
    \"sourceId\":\"$FILE_ID\",
    \"fundId\":\"SEQUOIA-XVIII\"
  }")

DOC_ID=$(echo "$DOC_RESPONSE" | jq -r '.id')

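# Step 4: Poll for completion (give up after 60 attempts, about 5 minutes)
MAX_ATTEMPTS=60
for ((attempt = 1; attempt <= MAX_ATTEMPTS; attempt++)); do
  STATUS=$(curl -s -X GET "$API_URL/documents/$DOC_ID/workflow-status" \
    -H "Authorization: Bearer $API_KEY" | jq -r '.status')

  echo "Status: $STATUS"

  if [ "$STATUS" = "completed" ]; then
    echo "Processing complete!"
    exit 0
  elif [ "$STATUS" = "failed" ]; then
    echo "Processing failed"
    exit 1
  fi

  sleep 5
done

echo "Timed out waiting for processing; contact support with document ID $DOC_ID" >&2
exit 1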

07 Best Practices

Use Descriptive Names

Include fund name, quarter, and year in document names: "Sequoia-XVIII-Q3-2024-Capital-Statement" makes it easy to find documents later.

Tag Consistently

Use consistent tags like "capital-statement", "quarterly-report", "presentation" to make filtering easier. Tags are indexed for fast lookup.

Always Review Extracted Data

AI extraction is typically 95-99% accurate on well-formatted documents, but always review before importing. Check dates, currencies, and decimal places, which are the most common extraction errors.
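
Part of that review can be automated with simple format checks before a human looks at the data. A minimal sketch, assuming currencies are three-letter uppercase codes such as "USD", rates such as expected_irr or management_fee are fractions between 0 and 1 (as in the analyze-fund response), and dates are written as YYYY-MM-DD. The date format and all three function names are assumptions, not part of the API:

```bash
#!/bin/bash
# Hypothetical checks for the most common extraction errors.
# Each function returns 0 if the value looks plausible, 1 otherwise.

# Three uppercase letters, e.g. USD. Format check only; it does not
# verify that the code is a real currency.
is_currency_code() {
  local re='^[[:upper:]]{3}$'
  [[ $1 =~ $re ]]
}

# YYYY-MM-DD with a month of 01-12 and a day of 01-31. Format check only;
# it does not reject impossible dates such as 2024-02-31.
is_iso_date() {
  local re='^[0-9]{4}-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])$'
  [[ $1 =~ $re ]]
}

# A plain decimal number in the range [0, 1]. Catches a rate extracted
# as 20 where 0.20 was meant.
is_fraction() {
  local re='^[0-9]+(\.[0-9]+)?$'
  [[ $1 =~ $re ]] || return 1
  awk -v v="$1" 'BEGIN { exit !(v >= 0 && v <= 1) }'
}
```

For example, is_fraction 0.20 succeeds while is_fraction 20 fails, which flags a misplaced decimal point for manual review. These checks narrow down what to look at; they do not replace the review itself.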

Keep Original Documents

Don't delete documents after extraction. Keep them for audit trails and to re-extract if needed. Storage is cheap, but losing source documents is expensive.

08 Troubleshooting

Upload URL expired (403 error)

Cause: Upload URLs expire after 1 hour.

Solution: Get a new upload URL and try again. Upload immediately after receiving the URL.
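
A client can avoid the 403 by checking the URL's age before uploading. A minimal sketch, assuming the client records the time at which it requested the URL (the response shown in this guide carries no expiry field) and using the 1-hour validity stated above. The function and variable names are hypothetical:

```bash
#!/bin/bash
# Validity window stated in this guide: upload URLs expire after 1 hour.
UPLOAD_URL_TTL_SECONDS=3600

# Hypothetical helper: return 0 if a URL issued at $1 (epoch seconds)
# has expired at time $2 (epoch seconds, defaults to the current time).
# An age of exactly one hour is treated as expired.
upload_url_expired() {
  local issued_at=$1
  local now=${2:-$(date +%s)}
  (( now - issued_at >= UPLOAD_URL_TTL_SECONDS ))
}
```

A script would store ISSUED_AT=$(date +%s) right after requesting the URL, then request a fresh one whenever upload_url_expired "$ISSUED_AT" succeeds.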

Processing stuck at "processing" status

Cause: AI processing failed but the status wasn't updated, or the document is very large.

Solution: Wait up to 5 minutes for large documents (>50 pages). If the document is still stuck, contact support with the document ID.
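
To keep a script from waiting forever on a stuck document, give the polling loop a fixed budget. A minimal sketch of a bounded poll, where the status command is passed in as arguments so the loop itself makes no assumptions about the API. The name wait_for_status and the return codes are hypothetical:

```bash
#!/bin/bash
# Hypothetical helper: run a status command repeatedly until it prints
# "completed" or "failed", or until the attempt budget is used up.
#   $1    maximum number of polls
#   $2    seconds to sleep between polls
#   $3... command that prints the current status on stdout
# Returns 0 on completed, 1 on failed, 2 if the budget ran out.
wait_for_status() {
  local max_attempts=$1 interval=$2
  shift 2
  local attempt status
  for (( attempt = 1; attempt <= max_attempts; attempt++ )); do
    status=$("$@")
    case "$status" in
      completed) return 0 ;;
      failed)    return 1 ;;
    esac
    # Skip the sleep after the final attempt.
    if (( attempt < max_attempts )); then
      sleep "$interval"
    fi
  done
  return 2
}
```

With 60 attempts at 5-second intervals the budget is about 5 minutes, for example wait_for_status 60 5 fetch_status "$DOC_ID", where fetch_status is a hypothetical function wrapping the workflow-status request. A return code of 2 is the point at which to contact support with the document ID.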

Extracted data is incorrect

Common causes: Scanned PDFs (not text-based), complex tables, non-standard formatting.

Solution: For scanned PDFs, run OCR first. For complex documents, manually enter data. Report persistent issues so AI can learn from them.

Document not found in search

Cause: Document hasn't finished processing, or it's in a different tenant.

Solution: Check workflow status. Verify you're searching the correct tenant. Documents are tenant-isolated for security.

Save 95% of Data Entry Time

Start uploading documents today and let AI handle the tedious work. Your team will thank you.
