.NET Programming With Me

Indexing Strategies in PostgreSQL That Every .NET Developer Should Know

Most .NET developers let EF Core generate their PostgreSQL indexes and then wonder why certain queries are slow in production. The default behaviour is reasonable, but PostgreSQL offers several index types that SQL Server does not, and knowing when to reach for each one is the difference between a fast API and a scaling problem.

This post covers the four index strategies that matter most for .NET API workloads: B-Tree, GIN, partial, and expression indexes. Each section explains what the index is for, shows how to create it (both via EF Core and raw SQL), and includes a EXPLAIN ANALYZE pattern to verify it is being used.


1 Prerequisites

Before starting, make sure you have:

  • .NET 8 SDK and PostgreSQL 14 or later
  • Npgsql.EntityFrameworkCore.PostgreSQL 8.x
  • Access to psql, pgAdmin, or DataGrip to run EXPLAIN ANALYZE
  • A working EF Core project with at least one entity mapped to a table
Note: Index decisions should always be validated with realistic data volumes. An index that helps at 100,000 rows may be ignored at 500 rows because the query planner considers a sequential scan cheaper on small tables.

2 B-Tree indexes — the default

Every HasIndex() call in EF Core creates a B-Tree index. B-Tree is the right choice for equality and range filters on scalar columns: integers, GUIDs, dates, strings, and decimals. If you are filtering by UserId, sorting by CreatedAt, or looking up by Email, B-Tree is what you want.

Creating B-Tree indexes with EF Core

modelBuilder.Entity<Order>(entity =>
{
    // Single-column index for equality lookups
    entity.HasIndex(o => o.CustomerId);

    // Composite index — column order matters
    // Queries that filter on Status AND sort by CreatedAt benefit from this
    entity.HasIndex(o => new { o.Status, o.CreatedAt });

    // Unique index — enforced at the database level
    entity.HasIndex(o => o.ReferenceNumber).IsUnique();
});

Composite index column ordering

For composite B-Tree indexes, put the highest-cardinality equality filter column first, followed by range or sort columns. A query that filters on Status = 'Pending' and orders by CreatedAt DESC uses the index above efficiently. A query that only filters on CreatedAt cannot use the leftmost prefix and will do a sequential scan.

-- Confirm index usage for the composite index
EXPLAIN ANALYZE
SELECT * FROM "Orders"
WHERE "Status" = 'Pending'
ORDER BY "CreatedAt" DESC
LIMIT 50;
-- Look for: Index Scan Backward using "IX_Orders_Status_CreatedAt"
Tip: A covering index (PostgreSQL 11+) stores additional columns in the index leaf nodes, allowing the query to be satisfied without touching the heap. Use IncludeProperties() in EF Core to add include columns to a B-Tree index.
// Covering index: the query planner can serve the SELECT entirely from the index
entity.HasIndex(o => new { o.Status, o.CreatedAt })
    .IncludeProperties(o => new { o.Id, o.TotalAmount });

3 GIN indexes — for JSONB, arrays, and full-text search

GIN (Generalised Inverted Index) indexes the internal structure of a composite value rather than the value itself. Use GIN for jsonb columns, PostgreSQL array columns, and tsvector full-text search columns. A B-Tree index on a jsonb column is useless for containment queries; GIN is what makes them fast.

GIN index on a JSONB column

// EF Core does not expose GIN natively; use a raw SQL migration
public partial class AddMetadataGinIndex : Migration
{
    protected override void Up(MigrationBuilder migrationBuilder)
    {
        // Full-document GIN index — covers all keys and values in the document
        migrationBuilder.Sql(
            """
            CREATE INDEX ix_products_metadata_gin
            ON "Products"
            USING GIN ("Metadata");
            """
        );
    }

    protected override void Down(MigrationBuilder migrationBuilder)
    {
        migrationBuilder.Sql(
            """DROP INDEX IF EXISTS ix_products_metadata_gin;"""
        );
    }
}

GIN index on a tsvector column

// When using HasGeneratedTsVectorColumn, chain HasMethod("GIN")
modelBuilder.Entity<Article>()
    .HasGeneratedTsVectorColumn(a => a.SearchVector, "english",
        a => new { a.Title, a.Body })
    .HasIndex(a => a.SearchVector)
    .HasMethod("GIN");  // Npgsql honours this for tsvector columns
Tip: GIN indexes are larger and slower to build than B-Tree indexes, but they are dramatically faster for containment and match queries on complex types. Use gin_pending_list_limit (default 4 MB) to control how much data is buffered in a pending list before being merged into the main index structure.

4 Partial indexes — index a subset of rows

A partial index includes only the rows that match a WHERE clause. If 95% of your Orders table has Status = 'Completed', an index that covers all statuses is bloated with rows you rarely query. A partial index on Status = 'Pending' is smaller, faster to scan, and faster to update because only new pending orders touch it.

Creating a partial index with EF Core

modelBuilder.Entity<Order>(entity =>
{
    // Only index rows where the order is not yet completed
    entity.HasIndex(o => o.CreatedAt)
        .HasFilter("\"Status\" != 'Completed'");
});
-- Equivalent SQL (generated by the migration):
CREATE INDEX "IX_Orders_CreatedAt"
ON "Orders" ("CreatedAt")
WHERE "Status" != 'Completed';

Soft-delete pattern with a partial index

// A very common pattern: index only non-deleted rows
modelBuilder.Entity<Customer>(entity =>
{
    entity.HasIndex(c => c.Email)
        .IsUnique()
        .HasFilter("\"IsDeleted\" = false");
    // This also enforces unique emails only among active customers
});

This pattern is particularly useful for soft-delete implementations: the unique constraint fires only for active records, and the index is small because deleted rows are excluded.


5 Expression indexes — index a computed value

An expression index (also called a functional index) indexes the result of an expression rather than a raw column value. Use this when your queries consistently apply a function to a column before filtering, such as lower-casing an email for case-insensitive lookups or extracting a date from a timestamp.

Case-insensitive email lookup

-- Create via a raw SQL migration (EF Core HasIndex does not support expressions)
CREATE INDEX ix_customers_email_lower
ON "Customers" (lower("Email"));
// Your LINQ query must use the same expression for the index to apply
var customer = await context.Customers
    .Where(c => c.Email.ToLower() == email.ToLower())
    .FirstOrDefaultAsync();

// EF Core + Npgsql translates ToLower() to lower(), which matches the index

Indexing a date extracted from a timestamp

-- Queries that filter on the date part of a timestamptz column
CREATE INDEX ix_orders_created_date
ON "Orders" (date_trunc('day', "CreatedAt" AT TIME ZONE 'UTC'));
Note: The expression in a LINQ Where clause must match the index expression exactly for the planner to use it. If the index is on lower("Email") but your query sends "Email" ILIKE $1, the index will not be used. Always verify with EXPLAIN ANALYZE.

6 Finding missing and unused indexes

PostgreSQL tracks index usage statistics in pg_stat_user_indexes. After running your application under realistic load, query this view to identify indexes that are never used (candidates for removal) and tables with high sequential scan counts but no supporting index (candidates for a new index).

-- Indexes that have never been used since the last statistics reset
SELECT
    schemaname,
    relname    AS table_name,
    indexrelname AS index_name,
    idx_scan   AS times_used
FROM pg_stat_user_indexes
WHERE idx_scan = 0
  AND schemaname = 'public'
ORDER BY relname;

-- Tables with high sequential scans (potential missing indexes)
SELECT
    relname    AS table_name,
    seq_scan,
    idx_scan,
    n_live_tup AS row_estimate
FROM pg_stat_user_tables
WHERE seq_scan > 100
  AND schemaname = 'public'
ORDER BY seq_scan DESC;

An unused index still incurs write overhead on every INSERT, UPDATE, and DELETE. Remove indexes that have never been used after several weeks of production traffic. Statistics are reset by pg_stat_reset() or when PostgreSQL restarts, so collect data over a meaningful time window before drawing conclusions.


Wrapping up

PostgreSQL's index toolkit is broader than most SQL Server developers realise. B-Tree covers the common cases that EF Core's HasIndex() already generates. GIN unlocks efficient querying of jsonb documents, arrays, and full-text search vectors. Partial indexes keep write-heavy tables lean by excluding rows you rarely read. Expression indexes push function application into the index so query-time function calls do not bypass it.

The consistent discipline across all four types is the same: define the index in a migration, load representative data, run EXPLAIN ANALYZE on your actual queries, and check pg_stat_user_indexes after the application has been under load. Never assume an index is being used without verifying it.

Got a question or ran into a problem? Drop a comment below and I will reply.

No comments: