Let me tell you a quick story first. Three months ago I was building an automation that pulled product data from a third-party REST API. The HTTP Request node fetched 200+ items in one shot — beautiful. Except the JSON response was a nightmare. Nested three levels deep, half the fields had null values, dates were in Unix timestamps, and the price field was a string with a currency symbol attached. My PostgreSQL node took one look at that data and basically said “bro, no.”
That one workflow taught me more about data cleaning in n8n than any tutorial ever could. And now, I’m passing all of that knowledge to you.
Today’s session is going to be a deep, practical guide on how to fetch APIs properly in n8n, handle all the weird edge cases, and clean your data so it’s ready for storage, AI analysis, or whatever downstream node is waiting for it. This isn’t a “click here, then click there” tutorial. We’re going to understand why things break and how to think like a developer when building n8n workflows.
For those who are new here — hello, I goes by the nickname axiomcompute, and welcome to the session. For the returning folks, you guys already know the vibe — let’s get straight into it.
Why APIs Are the Backbone of n8n Automation
Before we touch a single node, let me set the context. Everything in modern automation revolves around APIs. Your CRM has an API. Your payment gateway has an API. The weather forecast? API. Even that random meme generator you found at 2 AM? That’s an API too.
n8n has dedicated nodes for popular services — Slack, Google Sheets, Notion, etc. But here’s the reality — 80% of real-world integrations don’t have a dedicated node. They have an API documentation page and an endpoint URL. That’s it. And for those, you need the HTTP Request node.
The HTTP Request node is honestly the most important node in n8n after the Webhook. Master this, and you can connect n8n to literally anything that exists on the internet.
The HTTP Request Node: Deep Dive
Let’s break down this node — not just what buttons to click, but what’s actually happening under the hood.
Method Types — Picking the Right One
When you open the HTTP Request node, the first thing you see is the Method dropdown. This is where most beginners go wrong — they pick GET for everything or POST for everything without understanding what each method means.
| Method | What It Does | When to Use in n8n |
|---|---|---|
| GET | Fetches data from a server | Reading data — list of products, user profiles, weather info |
| POST | Sends data to create something new | Submitting forms, creating records, sending messages |
| PUT | Replaces entire resource | Updating a full record (overwrites everything) |
| PATCH | Updates specific fields | Changing just the status or a single field |
| DELETE | Removes a resource | Deleting a record from external service |
Pro Tip: Always check the API documentation for which method to use. I once used POST instead of PATCH on a CRM API and accidentally created 50 duplicate contacts instead of updating existing ones. Fun times.
URL Configuration
The URL field accepts both static and dynamic URLs. This is where n8n’s expression system becomes your best friend.
// Static URL
https://api.example.com/v1/products
// Dynamic URL using expressions
https://api.example.com/v1/products/{{ $json.product_id }}
// With query parameters built dynamically
https://api.example.com/v1/search?q={{ $json.search_term }}&limit=50Note:- If you’re dealing with special characters in your dynamic values (spaces, emojis, Indian language text), you’ll need to encode them. Use the encodeURIComponent() function inside expressions:
https://api.example.com/search?q={{ encodeURIComponent($json.search_term) }}I learned this the hard way when a user’s name contained “é” and the API returned a 400 error. Encoding fixes that instantly.
Authentication — The Part Everyone Struggles With
This is where 60% of beginners give up. The API works in Postman but breaks in n8n. I’ve been there. Let me save you those hours.
n8n supports multiple authentication types:
- Predefined Credential Type: If n8n has a built-in credential for the service (like OpenAI, Slack), always use this. It handles token refresh, base URLs, everything.
- Generic Header Auth: For APIs that need an
Authorization: Bearer your_tokenheader. Most common one you’ll use. - Basic Auth: Username and password. Some older APIs still use this.
- OAuth2: The complex one. For services like Google, Microsoft, etc. n8n handles the token refresh automatically once set up.
- Custom Auth: When the API has a non-standard auth method (like API key in query params, or custom headers).
My recommendation: Always use Predefined Credential Type if available. It’s pre-configured, tested, and handles edge cases like token expiry. Only go custom when there’s no built-in option.
Headers and Query Parameters
Beyond the URL and auth, sometimes you need to send specific headers. The most common ones:
Content-Type: application/json // Telling API you're sending JSON
Accept: application/json // Telling API you want JSON back
X-API-Version: 2024-01-01 // Some APIs use versioning in headers
User-Agent: n8n-workflow/1.0 // Some APIs require thisImportant:- Some APIs (especially Indian payment gateways like Razorpay, Cashfree) require specific headers for every request. Always read the docs carefully. I once spent 4 hours debugging a Cashfree integration only to realize I was missing a x-client-id header. The error message? “Unauthorized.” That’s it. No helpful hint. Just “Unauthorized.”
Handling Pagination
This is a big one. Most APIs don’t return all data at once. They paginate — give you 20, 50, or 100 records per page. If you need all records, you must handle pagination.
n8n’s HTTP Request node has a built-in Pagination option under the Options section. There are three modes:
- Offset-Based: The API uses
?offset=0&limit=50, then?offset=50&limit=50, etc. - Cursor-Based: The API returns a cursor/token for the next page. More modern APIs use this.
- Response Contains Link to Next Page: The API response itself includes the URL for the next page.
Example of offset-based pagination config:
// Under Options → Pagination
Mode: Offset
Limit per page: 50
Offset parameter: offset
Limit parameter: limit
Max pages: 10 // Safety limit to prevent infinite loopsPro Tip: Always set a Max Pages limit. I once forgot this, and my workflow fetched 47,000 records from an API in a loop. My Neon database almost cried, and my n8n instance became unresponsive for 15 minutes. Learn from my mistakes, yeah?
Understanding API Responses — The Messy Reality
Here’s what tutorials don’t show you — real API responses are messy. Let me show you what I mean. Take a typical e-commerce API response:
{
"status": "success",
"data": {
"products": [
{
"id": 1024,
"title": "Wireless Mouse ",
"price": "$20.00",
"stock": null,
"category": {
"id": 5,
"name": "Electronics",
"parent": { "id": 1, "name": "All Products" }
},
"created": 1702569600,
"tags": ["wireless", "mouse", "computer"],
"description": "<p>High quality <b>wireless</b> mouse</p>",
"variants": []
}
],
"pagination": {
"total": 234,
"page": 1,
"per_page": 50
}
}
}Look at the problems here:
- The actual data is nested inside
data.products— not at root level - Title has trailing whitespace:
"Wireless Mouse " - Price is a string with currency symbol:
"$20.00" stockisnull— will crash your calculations- Category is deeply nested — 3 levels deep
- Date is Unix timestamp, not human-readable
- Description has HTML tags mixed in
variantsis an empty array — might cause issues downstream
If you pass this raw data to a PostgreSQL Insert node, half your columns will be wrong, some will be null, and some will throw type mismatch errors. This is why data cleaning exists.
Data Cleaning in n8n: Three Approaches
Now we get to the main event. There are three ways to clean data in n8n, each with its own strengths. I use all three depending on the situation.
Approach 1: Expressions (Quick Fixes)
For simple transformations, you can use expressions directly in the field mappings of any node. No extra node needed.
// Remove whitespace
{{ $json.title.trim() }}
// Extract number from price string
{{ parseFloat($json.price.replace(/[^0-9.]/g, '')) }}
// Handle null with fallback
{{ $json.stock ?? 0 }}
// Access nested data
{{ $json.category.name }}
// Format Unix timestamp to ISO date
{{ new Date($json.created * 1000).toISOString() }}
// Join array to string
{{ $json.tags.join(', ') }}When to use expressions: When you need to fix 1-3 fields and the transformations are simple. Don’t overcomplicate it.
When NOT to use: When you’re transforming more than 5 fields, or when the logic involves conditions (if-else). At that point, switch to the Code node.
Approach 2: The Edit Fields (Set) Node
This node is highly underrated. It lets you restructure data completely — rename fields, set new values using expressions, drop fields you don’t need. No code required.
How I use it:
- Set the mode to “Manual Mapping”
- Add only the fields you want in the output
- Toggle “Keep Only Set Fields” — this removes all the junk
Example configuration:
// Field mappings in the Edit Fields node:
product_name → {{ $json.title.trim() }}
price_inr → {{ parseFloat($json.price.replace('$','').replace(',','')) }}
stock_count → {{ $json.stock ?? 0 }}
category → {{ $json.category.name }}
created_date → {{ new Date($json.created * 1000).toISOString().split('T')[0] }}
has_variants → {{ $json.variants.length > 0 }}See? No code, no JavaScript knowledge needed. Just expressions. This is the sweet spot for most data cleaning tasks.
Pro Tip: The Edit Fields node also has a “Dot Notation” option. If enabled, you can access nested fields like category.name directly. If disabled, it treats dots as literal characters in field names. I keep it enabled 99% of the time.
Approach 3: The Code Node (Full Developer Power)
When things get serious — complex conditionals, loops, regex operations, or processing arrays — the Code node is where you go. It supports both JavaScript and Python.
Here’s a real-world cleaning script I use for processing API responses:
// Code Node — Clean e-commerce API data
const items = $input.all();
const cleaned = [];
for (const item of items) {
const raw = item.json;
// Skip if product is missing essential fields
if (!raw.title || !raw.price) continue;
// Clean price — handle ₹, commas, random spaces
const priceStr = String(raw.price)
.replace(/[₹$€,\s]/g, '')
.trim();
const price = parseFloat(priceStr);
// Skip invalid prices
if (isNaN(price) || price <= 0) continue;
// Strip HTML from description
const cleanDescription = (raw.description || '')
.replace(/<[^>]*>/g, '')
.replace(/ /g, ' ')
.trim();
// Normalize category — handle nested or flat
let categoryName = 'Uncategorized';
if (typeof raw.category === 'object' && raw.category?.name) {
categoryName = raw.category.name;
} else if (typeof raw.category === 'string') {
categoryName = raw.category;
}
// Convert Unix timestamp or ISO string
let createdDate;
if (typeof raw.created === 'number') {
createdDate = new Date(raw.created * 1000).toISOString();
} else if (typeof raw.created === 'string') {
createdDate = new Date(raw.created).toISOString();
} else {
createdDate = new Date().toISOString();
}
cleaned.push({
json: {
product_id: raw.id,
name: raw.title.trim(),
price_inr: price,
stock: raw.stock ?? 0,
category: categoryName,
tags: Array.isArray(raw.tags) ? raw.tags.join(', ') : '',
description: cleanDescription.substring(0, 500),
has_variants: Array.isArray(raw.variants) && raw.variants.length > 0,
created_at: createdDate,
fetched_at: new Date().toISOString()
}
});
}
return cleaned;That script handles everything — null values, type mismatches, nested objects, HTML stripping, price normalization, date conversion, and even skips invalid records. This is how a developer thinks about data cleaning.
Note:- The Code node has two modes — “Run Once for All Items” and “Run Once for Each Item”. For data cleaning, almost always use “Run Once for All Items” because you can filter, merge, and restructure the entire dataset. The “Each Item” mode processes one at a time — useful for independent API calls but wasteful for cleaning.
Real-World Scenario: Weather API to Database Pipeline
Theory ke baad practice. Let’s build a real workflow — fetch weather data for Indian cities from an API, clean it, and store it in PostgreSQL. This is something you can genuinely use.
Step 1: The API Call
We’ll use Open-Meteo API — it’s free, no API key needed, and perfect for learning.
// HTTP Request Node Configuration
Method: GET
URL: https://api.open-meteo.com/v1/forecast
Query Parameters:
latitude: 28.6139
longitude: 77.2090
current_weather: true
hourly: temperature_2m,relative_humidity_2m
timezone: Asia/KolkataThe response you’ll get looks something like:
{
"latitude": 28.625,
"longitude": 77.21875,
"current_weather": {
"temperature": 34.2,
"windspeed": 12.5,
"weathercode": 2,
"time": "2026-04-11T14:00"
},
"hourly": {
"time": ["2026-04-11T00:00", "2026-04-11T01:00", ...],
"temperature_2m": [28.1, 27.3, 26.8, ...],
"relative_humidity_2m": [45, 48, 52, ...]
}
}See the problem? The hourly data is in separate arrays — time[], temperature_2m[], humidity[]. They’re parallel arrays — index 0 of each corresponds to the same hour. But n8n nodes work with objects, not parallel arrays. We need to zip them together.
Step 2: Clean & Transform
// Code Node — Transform weather data
const weather = $input.first().json;
const results = [];
const times = weather.hourly.time;
const temps = weather.hourly.temperature_2m;
const humidity = weather.hourly.relative_humidity_2m;
for (let i = 0; i < times.length; i++) {
// Parse the time
const dateObj = new Date(times[i]);
const hour = dateObj.getHours();
// Only keep daytime hours (6 AM to 10 PM)
if (hour < 6 || hour > 22) continue;
// Categorize temperature for Indian context
let tempCategory;
if (temps[i] >= 42) tempCategory = 'extreme_heat';
else if (temps[i] >= 35) tempCategory = 'hot';
else if (temps[i] >= 25) tempCategory = 'warm';
else if (temps[i] >= 15) tempCategory = 'pleasant';
else tempCategory = 'cold';
results.push({
json: {
city: 'Delhi',
recorded_at: times[i],
temperature_c: temps[i],
humidity_percent: humidity[i],
temp_category: tempCategory,
heat_index: calculateHeatIndex(temps[i], humidity[i]),
fetched_at: new Date().toISOString()
}
});
}
function calculateHeatIndex(temp, hum) {
// Simplified heat index for high temperatures
if (temp < 27) return temp;
const hi = -8.785 + 1.611*temp + 2.339*hum
- 0.146*temp*hum - 0.012*(temp**2)
- 0.016*(hum**2) + 0.002*(temp**2)*hum;
return Math.round(hi * 10) / 10;
}
return results;See what we did there? We zipped parallel arrays, filtered night hours, categorized temperatures for Indian weather context (because “warm” in Delhi is very different from “warm” in London), and even calculated a heat index. This is the kind of data cleaning that makes your downstream nodes happy.
Step 3: Store in PostgreSQL
-- Create the table first (run in Neon SQL Editor)
CREATE TABLE weather_logs (
id SERIAL PRIMARY KEY,
city VARCHAR(50) NOT NULL,
recorded_at TIMESTAMP WITH TIME ZONE NOT NULL,
temperature_c DECIMAL(5,2),
humidity_percent INTEGER,
temp_category VARCHAR(20),
heat_index DECIMAL(5,2),
fetched_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
UNIQUE(city, recorded_at)
);
CREATE INDEX idx_weather_city ON weather_logs(city);
CREATE INDEX idx_weather_recorded ON weather_logs(recorded_at);Then connect a PostgreSQL node with Insert or Update operation, matching on city + recorded_at. This way if you run the workflow multiple times, it updates existing records instead of creating duplicates. Clean.
10 Data Cleaning Patterns I Use Daily
This is the section I wish someone wrote for me when I started. These are copy-paste ready snippets you can drop into any Code node or expression field.
// 1. Phone number normalization (Indian)
const phone = raw.phone.replace(/[\s-()]/g, '');
const normalized = phone.startsWith('+91') ? phone
: phone.startsWith('91') ? '+' + phone
: '+91' + phone;
// 2. Email validation & lowercase
const email = (raw.email || '').trim().toLowerCase();
const isValid = /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(email);
// 3. Name title case
const titleCase = raw.name.replace(/\w\S*/g,
txt => txt.charAt(0).toUpperCase() + txt.substr(1).toLowerCase()
);
// 4. Remove emoji from text
const noEmoji = raw.text.replace(/[\u{1F600}-\u{1F9FF}]/gu, '').trim();
// 5. Parse Indian currency
const amount = parseFloat(
raw.price.replace(/[₹,\s]/g, '')
);
// 6. Flatten nested object
const flat = {
name: raw.user.profile.name,
city: raw.user.address?.city || 'Unknown',
pin: raw.user.address?.pincode || ''
};
// 7. Date format: DD/MM/YYYY to ISO
const [day, month, year] = raw.date.split('/');
const isoDate = new Date(year, month - 1, day).toISOString();
// 8. Deduplicate array
const unique = [...new Set(raw.tags)];
// 9. Safe JSON parse (from stringified fields)
let parsed;
try { parsed = JSON.parse(raw.metadata); }
catch { parsed = {}; }
// 10. Truncate long text with ellipsis
const short = raw.description.length > 200
? raw.description.substring(0, 197) + '...'
: raw.description;Bookmark this section. Seriously. You’ll come back to it more than you think.
Error Handling — Because APIs Will Fail
This is the section that separates hobby workflows from production-grade ones. APIs go down. They return 500 errors. They rate-limit you. They change their response format without telling you. You need to be ready.
Retry on Failure
The HTTP Request node has a built-in Retry on Fail option under Settings. Always enable this for critical API calls.
- Max Retries: 3 (don’t go higher, you’ll just hammer the API)
- Wait Between Retries: 1000ms minimum (some APIs ban you for rapid retries)
- Retry On: “All Errors” or specific status codes like 429 (rate limit), 500, 502, 503
The Error Workflow Pattern
Set up an Error Workflow that catches failures and notifies you. I covered this concept briefly in my WhatsApp Lead Agent blog — the Error Handler node (Node 27). Same principle applies here.
// Code Node — Validate API response before processing
const response = $input.first().json;
// Check if response is valid
if (!response || !response.data) {
throw new Error('API returned empty or invalid response');
}
// Check for API-level errors
if (response.status === 'error' || response.error) {
throw new Error('API error: ' + (response.message || 'Unknown error'));
}
// Check if data array exists and has items
if (!Array.isArray(response.data.products) || response.data.products.length === 0) {
// Return empty but valid result instead of crashing
return [{ json: { _empty: true, reason: 'No products found' } }];
}
// Data is valid — pass through
return $input.all();My rule of thumb: Always validate the API response before cleaning. A Code node right after the HTTP Request node that checks for errors, empty responses, and unexpected formats. 30 seconds to set up, saves hours of debugging later.
Rate Limiting and Batch Processing
Many APIs limit how many requests you can make per minute. If your workflow processes 500 items and each needs an API call, you’ll get rate-limited real quick.
Solutions:
- Split In Batches node: Process items in groups of 10 or 20 instead of all at once
- Wait node: Add a delay between batches (even 1 second helps)
- HTTP Request node’s “Batch Size” option: Built-in batching that limits concurrent requests
// Workflow structure for rate-limited APIs:
[Trigger] → [Split In Batches (10)] → [HTTP Request] → [Wait 1s] → [Loop Back]
↓
[Code: Clean Data]
↓
[PostgreSQL: Store]Pro Tip: If the API returns rate-limit headers (X-RateLimit-Remaining, Retry-After), you can read these in the Code node and dynamically adjust your wait time. That’s advanced stuff, but incredibly useful for high-volume workflows.
When to Use Which Approach — Decision Framework
I know I’ve thrown a lot at you. Let me simplify the decision with this table:
| Scenario | Best Approach | Why |
|---|---|---|
| Fix 1-2 fields | Expressions | No extra node needed |
| Restructure/rename 3-8 fields | Edit Fields node | Visual, no code, easy to debug |
| Complex logic, filtering, loops | Code node | Full programming power |
| Validate API response | Code node | Need conditionals and error throwing |
| Merge data from 2+ sources | Merge node + Edit Fields | Built-in merge logic handles most cases |
| Split arrays into items | Split Out node | One-click, no code needed |
Quick Homework (Optional, But Rewarding)
If you’ve read this far, you’re serious about learning. So here’s a mini challenge for you:
- Pick any free public API from this list
- Fetch data using HTTP Request node
- Clean it using at least 2 of the 3 approaches I taught
- Store the cleaned data in your Neon PostgreSQL database
- Run the workflow 3 times and verify no duplicates were created
If you get stuck, email me or drop a comment. I reply to every single one — I promise. You can also refer to my PostgreSQL operations guide for the database part.
Wrapping Up
Fetching APIs in n8n is easy. Fetching APIs properly — with authentication, pagination, error handling, and clean data pipelines — that’s where the skill is. And now you have that skill.
Let me give you the TL;DR one more time:
- Use the HTTP Request node for any API without a dedicated n8n node
- Always check authentication type in the API docs first
- Handle pagination with built-in options (and always set a max limit)
- Clean data using Expressions for quick fixes, Edit Fields for restructuring, Code node for complex transformations
- Always validate responses before processing
- Enable Retry on Fail for production workflows
- Use parameterized queries when storing data in PostgreSQL
That’s it for today’s session. This one was a long one — but I genuinely believe if you practice these patterns, you’ll build automations that actually work in production, not just in tutorials. Keep grinding, keep automating. See you in the next blog!
