.NET Programming With Me

Full-Text Search in PostgreSQL from a .NET 8 EF Core Controller API

Adding keyword search to an API often leads .NET developers toward Elasticsearch or Azure Cognitive Search before they have even looked at what their database already supports. PostgreSQL's built-in full-text search, backed by a GIN index, handles a large class of search requirements without any external service, and Npgsql exposes it cleanly in LINQ.

This post walks through configuring a generated tsvector column on a PostgreSQL table, indexing it correctly, and wiring up a working search endpoint in an ASP.NET Core Controller API using EF Core 8.


1 Prerequisites

Before starting, make sure you have:

  • .NET 8 SDK or later
  • PostgreSQL 14 or later
  • Npgsql.EntityFrameworkCore.PostgreSQL 8.x NuGet package
  • An ASP.NET Core Web API project with a configured DbContext
Note: All FTS examples use the english language configuration. PostgreSQL ships with configurations for many languages. Run SELECT cfgname FROM pg_ts_config; in psql to see what is available on your instance.

2 How PostgreSQL full-text search works

PostgreSQL FTS uses two building blocks. A tsvector is a pre-processed, sorted list of lexemes (normalised word roots) derived from a text document. A tsquery is a search expression with AND (&), OR (|), NOT (!), and phrase (<->) operators. The @@ match operator checks whether a tsquery matches a tsvector.

-- What tsvector looks like for a sentence:
SELECT to_tsvector('english', 'Building scalable APIs with .NET 8 and PostgreSQL');
-- Result: '.net':5 '8':6 'api':4 'build':1 'postgresql':8 'scalabl':3

-- What tsquery looks like for a user search term:
SELECT plainto_tsquery('english', 'scalable apis');
-- Result: 'scalabl' & 'api'

-- The match operator:
SELECT to_tsvector('english', 'Building scalable APIs with .NET 8')
    @@ plainto_tsquery('english', 'scalable apis');
-- Result: true

The key insight is that both sides are reduced to lexemes: "Building" becomes "build", "APIs" becomes "api", "scalable" becomes "scalabl". This is why FTS finds "APIs" when you search for "api" and why it ignores stop words like "with" and "and".


3 Adding a generated tsvector column

Rather than recomputing the tsvector on every search query, store it as a generated (computed) column that PostgreSQL keeps up to date automatically whenever the source columns change. Npgsql exposes a dedicated fluent API for this pattern.

The entity

using NpgsqlTypes;

public class Article
{
    public int Id { get; set; }
    public string Title { get; set; } = string.Empty;
    public string Body { get; set; } = string.Empty;
    public string Author { get; set; } = string.Empty;
    public DateTime PublishedAt { get; set; }

    // Populated automatically by PostgreSQL; never set this from C#
    public NpgsqlTsVector SearchVector { get; set; } = null!;
}

Configuring the generated column and GIN index

protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    modelBuilder.Entity<Article>(entity =>
    {
        entity.HasKey(a => a.Id);

        // Tell Npgsql to create a GENERATED ALWAYS AS tsvector column
        // weighting Title more heavily than Body (A = highest, D = lowest)
        entity.HasGeneratedTsVectorColumn(
                a => a.SearchVector,
                "english",
                a => new { a.Title, a.Body })
            .HasIndex(a => a.SearchVector)
            .HasMethod("GIN");
    });
}

What the migration generates

-- Npgsql emits a STORED generated column backed by to_tsvector:
ALTER TABLE "Articles"
    ADD COLUMN "SearchVector" tsvector
    GENERATED ALWAYS AS (
        to_tsvector('english',
            coalesce("Title", '') || ' ' || coalesce("Body", ''))
    ) STORED;

CREATE INDEX "IX_Articles_SearchVector"
    ON "Articles" USING GIN ("SearchVector");

Run dotnet ef migrations add AddArticleSearchVector and dotnet ef database update. Inspect the migration to confirm both the generated column and the GIN index are present before applying.

Tip: If you need to weight the title higher than the body (so that a title match ranks above a body match), use a raw SQL migration to create the generated column manually with setweight(to_tsvector('english', "Title"), 'A') || setweight(to_tsvector('english', "Body"), 'B'). Npgsql's HasGeneratedTsVectorColumn concatenates columns without weights.

4 Writing the search endpoint

With the column and index in place, the Controller endpoint is straightforward. Use PlainToTsQuery rather than ToTsQuery for user-supplied input: PlainToTsQuery treats the input as plain text without requiring valid tsquery syntax, so it will not throw on arbitrary search strings.

The Controller

using Microsoft.AspNetCore.Mvc;
using Microsoft.EntityFrameworkCore;
using NpgsqlTypes;

[ApiController]
[Route("api/[controller]")]
public class ArticlesController(AppDbContext context) : ControllerBase
{
    [HttpGet("search")]
    public async Task<IActionResult> Search(
        [FromQuery] string q,
        [FromQuery] int page = 1,
        [FromQuery] int pageSize = 20)
    {
        if (string.IsNullOrWhiteSpace(q))
            return BadRequest("A search term is required.");

        // PlainToTsQuery is safe for arbitrary user input
        var tsQuery = EF.Functions.PlainToTsQuery("english", q.Trim());

        var results = await context.Articles
            .Where(a => a.SearchVector.Matches(tsQuery))
            .OrderByDescending(a => a.SearchVector.Rank(tsQuery))
            .Select(a => new ArticleSearchResult(
                a.Id,
                a.Title,
                a.Author,
                a.PublishedAt,
                a.Body.Length > 300 ? a.Body.Substring(0, 300) + "…" : a.Body))
            .Skip((page - 1) * pageSize)
            .Take(pageSize)
            .ToListAsync();

        return Ok(results);
    }
}

public record ArticleSearchResult(
    int Id,
    string Title,
    string Author,
    DateTime PublishedAt,
    string Excerpt);

What EF Core sends to PostgreSQL

SELECT a."Id", a."Title", a."Author", a."PublishedAt",
    CASE WHEN length(a."Body") > 300
         THEN substring(a."Body", 1, 300) || '…'
         ELSE a."Body" END
FROM "Articles" AS a
WHERE a."SearchVector" @@ plainto_tsquery('english', $1)
ORDER BY ts_rank(a."SearchVector", plainto_tsquery('english', $1)) DESC
LIMIT $2 OFFSET $3
Tip: WebSearchToTsQuery is an alternative to PlainToTsQuery available in PostgreSQL 11+. It understands Google-style syntax: quoted phrases, the -word exclusion operator, and OR. Use it when you want to expose advanced search syntax to users without building a custom parser.

5 Running and testing it

Seed the table with a few articles, then test the endpoint with representative queries to confirm the GIN index is being used.

// Quick seed in Program.cs (development only)
if (app.Environment.IsDevelopment())
{
    using var scope = app.Services.CreateScope();
    var db = scope.ServiceProvider.GetRequiredService<AppDbContext>();
    await db.Database.MigrateAsync();

    if (!await db.Articles.AnyAsync())
    {
        db.Articles.AddRange(
            new Article { Title = "Getting started with EF Core 8", Body = "Entity Framework Core 8 introduces JSON columns, bulk updates, and complex type mappings.", Author = "Anaya", PublishedAt = DateTime.UtcNow.AddDays(-10) },
            new Article { Title = "PostgreSQL indexing strategies", Body = "B-Tree, GIN, and partial indexes each serve a different purpose in PostgreSQL.", Author = "Anaya", PublishedAt = DateTime.UtcNow.AddDays(-5) }
        );
        await db.SaveChangesAsync();
    }
}
-- Confirm the GIN index is used (run in psql or pgAdmin)
EXPLAIN ANALYZE
SELECT * FROM "Articles"
WHERE "SearchVector" @@ plainto_tsquery('english', 'entity framework');

-- Look for: Bitmap Index Scan on "IX_Articles_SearchVector"
-- or: Index Scan using "IX_Articles_SearchVector"

A successful result shows a Bitmap Index Scan or Index Scan against IX_Articles_SearchVector, not a sequential scan. If you see a sequential scan on a freshly populated table, run ANALYZE "Articles"; to update the planner statistics and re-check.


Wrapping up

PostgreSQL's built-in full-text search, combined with Npgsql's HasGeneratedTsVectorColumn and the Matches LINQ extension, gives you a capable search endpoint without any external service. The generated column keeps the tsvector current automatically, the GIN index keeps searches fast at scale, and PlainToTsQuery handles arbitrary user input safely.

For the next step, consider adding highlighted snippets using PostgreSQL's ts_headline() function via EF.Functions or a raw SQL projection, and look at pg_trgm if you need fuzzy matching for misspellings alongside FTS relevance ranking.

Got a question or ran into a problem? Drop a comment below and I will reply.

No comments: