Prerequisites
- A DataExtractorAI account (sign up here if you don’t have one)
- An API key (get yours from the Dashboard)
- Choose your integration method:
  - Direct API access with cURL (no additional requirements)
  - Node.js 14+ (for the Node.js SDK)
  - Any HTTP client library in your preferred language
Step 1: Choose Your Integration Method
You can either use our API directly with cURL/HTTP requests or install our SDK:
# Using npm
npm install dataextractorai
# Using yarn
yarn add dataextractorai
# Using pnpm
pnpm add dataextractorai
# Or use cURL directly with our API
# No installation needed!
For other platforms, check out our SDK Examples for more information or use cURL directly with our API Reference.
Step 2: Basic Usage with SDK
import { DataExtractorAI } from 'dataextractorai';
import fs from 'fs';

// Initialize the client with your API key
const extractor = new DataExtractorAI({
  apiKey: 'YOUR_API_KEY'
});

// Define extraction schema
const invoiceSchema = {
  type: 'object',
  properties: {
    invoice_number: { type: 'string' },
    date: { type: 'string', format: 'date' },
    total: { type: 'number' },
    vendor: { type: 'string' }
  }
};

// Extract data from a file
async function extractInvoiceData() {
  try {
    const result = await extractor.extract({
      file: fs.createReadStream('invoice.pdf'),
      schema: invoiceSchema
    });
    console.log('Extracted data:', result.data);
  } catch (error) {
    console.error('Extraction failed:', error.message);
  }
}

extractInvoiceData();
Important: Always keep your API key secure and never expose it in client-side code.
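One common way to follow this advice in Node.js is to read the key from an environment variable instead of hardcoding it. The sketch below assumes a variable named `DATAEXTRACTOR_API_KEY`; that name and the `loadApiKey` helper are illustrative and not part of the SDK.

```javascript
// Hypothetical helper: read the API key from the environment and fail fast if it is missing.
// The variable name DATAEXTRACTOR_API_KEY is an assumption, not an SDK convention.
function loadApiKey(env = process.env) {
  const key = env.DATAEXTRACTOR_API_KEY;
  if (typeof key !== 'string' || key.trim() === '') {
    throw new Error('DATAEXTRACTOR_API_KEY is not set');
  }
  return key.trim();
}

// Usage with the client from Step 2:
// const extractor = new DataExtractorAI({ apiKey: loadApiKey() });
```

Failing at startup when the variable is missing tends to be easier to debug than an authentication error on the first extraction request.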
Alternative: Using cURL
If you prefer to use the API directly, here are examples using cURL:
# Basic extraction with cURL
curl -X POST https://dataextractorai.com/api/v1/extract \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: multipart/form-data" \
  -F "file=@invoice.pdf" \
  -F 'schema={
    "type": "object",
    "properties": {
      "invoice_number": { "type": "string" },
      "date": { "type": "string", "format": "date" },
      "total": { "type": "number" },
      "vendor": { "type": "string" }
    }
  }'
# Using a template
curl -X POST https://dataextractorai.com/api/v1/extract \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: multipart/form-data" \
  -F "file=@invoice.pdf" \
  -F "templateId=invoice"
# With webhook for large documents
curl -X POST https://dataextractorai.com/api/v1/extract \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: multipart/form-data" \
  -F "file=@large_document.pdf" \
  -F "webhook_url=https://your-server.com/webhook" \
  -F "webhook_events[]=completed" \
  -F "webhook_events[]=failed"
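If you are calling the API from JavaScript with a plain HTTP client rather than cURL, the basic extraction request could be assembled roughly as follows. This is a sketch that assumes Node.js 18+ (for the global `fetch`, `FormData`, and `Blob`) and that the endpoint accepts the same `file` and `schema` form fields as the cURL command; `buildExtractRequest` is a hypothetical helper, not part of the SDK.

```javascript
// Sketch only: mirrors the "Basic extraction" cURL request using Node.js 18+ globals.
// buildExtractRequest is a hypothetical helper, not part of the SDK.
const EXTRACT_URL = 'https://dataextractorai.com/api/v1/extract';

function buildExtractRequest(fileBytes, fileName, schema, apiKey) {
  const form = new FormData();
  // Same two fields as the cURL example: the document and the JSON schema.
  form.append('file', new Blob([fileBytes]), fileName);
  form.append('schema', JSON.stringify(schema));
  return {
    url: EXTRACT_URL,
    options: {
      method: 'POST',
      // No Content-Type header here: fetch adds the multipart boundary itself.
      headers: { Authorization: `Bearer ${apiKey}` },
      body: form
    }
  };
}

// Usage:
// const bytes = fs.readFileSync('invoice.pdf');
// const { url, options } = buildExtractRequest(bytes, 'invoice.pdf', invoiceSchema, apiKey);
// const response = await fetch(url, options);
```

Serializing the schema with `JSON.stringify` also avoids hand-written JSON mistakes such as trailing commas.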
Tip: For large documents or batch processing, we recommend using webhooks to avoid timeout issues. The webhook example above shows how to set this up.
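On the receiving side, your server needs to tell the `completed` and `failed` events apart. The sketch below shows one way to dispatch on them. The payload shape (a JSON body with `event`, `data`, and `error` fields) is an assumption made for illustration, so confirm the real format in the API Reference; `handleWebhookPayload` is a hypothetical helper.

```javascript
// Sketch of webhook dispatch. The payload shape ({ event, data, error }) is an
// assumption; check the API Reference for the real format before relying on it.
function handleWebhookPayload(rawBody) {
  let payload;
  try {
    payload = JSON.parse(rawBody);
  } catch {
    return { status: 400, message: 'Invalid JSON' };
  }
  if (payload === null || typeof payload !== 'object') {
    return { status: 400, message: 'Invalid payload' };
  }
  switch (payload.event) {
    case 'completed':
      // Assumed field: payload.data holds the extracted values.
      return { status: 200, message: 'Extraction completed', data: payload.data };
    case 'failed':
      // Assumed field: payload.error describes what went wrong.
      return { status: 200, message: `Extraction failed: ${payload.error}` };
    default:
      // Acknowledge unknown events without acting on them.
      return { status: 200, message: 'Ignored' };
  }
}
```

Call this from whichever HTTP framework serves your webhook URL, passing the raw request body and replying with the returned status code.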
Step 3: Web Integration
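You can also use DataExtractorAI in browser environments. Because the example below embeds the API key in client-side code, which the note in Step 2 warns against, treat it as a prototype and route production requests through your own backend: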
Browser Integration Example
// For browser environments
import { DataExtractorAI } from 'dataextractorai';

const extractor = new DataExtractorAI({
  apiKey: 'YOUR_API_KEY'
});

document.getElementById('extract-form').addEventListener('submit', async (event) => {
  event.preventDefault();
  const fileInput = document.getElementById('document-file');
  if (!fileInput.files || fileInput.files.length === 0) {
    alert('Please select a file');
    return;
  }
  const file = fileInput.files[0];
  try {
    // Show loading state
    document.getElementById('result').textContent = 'Processing...';
    const result = await extractor.extract({
      file: file,
      schema: {
        type: 'object',
        properties: {
          invoice_number: { type: 'string' },
          date: { type: 'string' },
          total: { type: 'number' }
        }
      }
    });
    // Display results
    document.getElementById('result').textContent = JSON.stringify(result.data, null, 2);
  } catch (error) {
    document.getElementById('result').textContent = 'Error: ' + error.message;
  }
});
<form id="extract-form">
  <input type="file" id="document-file" accept=".pdf,.jpg,.png,.jpeg">
  <button type="submit">Extract Data</button>
  <pre id="result"></pre>
</form>
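The `accept` attribute only filters the file picker, so users can still select other file types. A minimal sketch of an extra check before calling `extract` is shown below; the `isAcceptedFile` helper is illustrative, and its extension list simply mirrors the `accept` attribute of the form.

```javascript
// Extensions taken from the accept attribute of the file input in the form.
const ACCEPTED_EXTENSIONS = ['.pdf', '.jpg', '.jpeg', '.png'];

// Hypothetical helper: true when the file name ends with an accepted extension.
function isAcceptedFile(fileName) {
  const lower = String(fileName).toLowerCase();
  return ACCEPTED_EXTENSIONS.some((ext) => lower.endsWith(ext));
}

// Usage inside the submit handler, after reading fileInput.files[0]:
// if (!isAcceptedFile(file.name)) {
//   document.getElementById('result').textContent = 'Unsupported file type';
//   return;
// }
```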
Next Steps
Now that you have the basics down, here are some next steps to get the most out of DataExtractorAI:
Choose Your Integration Method
Use our API directly with cURL/HTTP requests or integrate with our SDK - choose what works best for your workflow.
Learn about schema definition
Create custom schemas to extract exactly the data you need in the format you want. [Learn more](/schema/basic)
Explore API Reference
Check out our complete API reference for all available endpoints and options. View API Reference
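Whichever integration method you choose, it can help to compare the extracted data with the schema you sent before using it. The sketch below assumes a flat schema like `invoiceSchema` from Step 2 and that `result.data` is a plain object. `findSchemaMismatches` is a hypothetical helper, not an SDK function, and it only understands the primitive types `string`, `number`, and `boolean`.

```javascript
// Hypothetical helper: compare extracted data against a flat schema such as
// invoiceSchema from Step 2 and list the fields that are missing or mistyped.
function findSchemaMismatches(data, schema) {
  const problems = [];
  for (const [field, rule] of Object.entries(schema.properties || {})) {
    const value = data[field];
    if (value === undefined || value === null) {
      problems.push(`${field}: missing`);
    } else if (typeof value !== rule.type) {
      // Only 'string', 'number', and 'boolean' map directly onto typeof.
      problems.push(`${field}: expected ${rule.type}, got ${typeof value}`);
    }
  }
  return problems;
}

// Usage:
// const problems = findSchemaMismatches(result.data, invoiceSchema);
// if (problems.length > 0) console.warn('Check these fields:', problems);
```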