Part 3 — Efficient use of click-through rate via parent-child relationship

This part of the series introduces a new ranking signal: category click-through rate (CTR). The idea is that we can recommend popular content for users that don’t have a click history yet. Rather than just recommending based on articles, we recommend based on categories. However, these global CTR values can often change continuously, so we need an efficient way to update this value for all documents. We’ll do that by introducing parent-child relationships between documents in Vespa. We will also use sparse tensors directly in ranking. This post replicates this more detailed Vespa tutorial.

Photo by AbsolutVision on Unsplash

We assume that you have followed…

Retrieve paragraph and sentence level information with sparse and dense ranking features

We will walk through the steps necessary to create a question answering (QA) application that can retrieve sentence or paragraph level answers based on a combination of semantic and/or term-based search. We start by discussing the dataset used and the question and sentence embeddings generated for semantic search. We then include the steps necessary to create and deploy a Vespa application to serve the answers. We make all the required data available to feed the application and show how to query for sentence and paragraph level answers based on a combination of semantic and term-based search.

Photo by Brett Jordan on Unsplash

This tutorial is based…

Part 2 — From news search to news recommendations with embeddings

In this part, we’ll start transforming our application from news search to news recommendation using the embeddings created in this tutorial. An embedding vector will represent each user and news article. We will make the embeddings used available for download to make it easier to follow this post along. When a user comes, we retrieve his embedding and use it to retrieve the closest news articles via an approximate nearest neighbor (ANN) search. We also show that Vespa can jointly apply general filtering and ANN search, unlike competing alternatives available in the market.

Photo by Matt Popovich on Unsplash

We assume that you have followed the…

Part 1 — News search functionality

We will build a news recommendation app in Vespa without leaving a python environment. In this first part of the series, we want to develop an application with basic search functionality. Future posts will add recommendation capabilities based on embeddings and other ML models.

Photo by Filip Mishevski on Unsplash

This series is a simplified version of Vespa’s News search and recommendation tutorial. We will also use the demo version of the Microsoft News Dataset (MIND) so that anyone can follow along on their laptops.


The original Vespa news search tutorial provides a script to download, parse and convert the MIND dataset to Vespa format. …

Three ways to get started with pyvespa

pyvespa provides a python API to Vespa. The library’s primary goal is to allow for faster prototyping and facilitate Machine Learning experiments for Vespa applications.

Photo by Kristin Hillery on Unsplash

There are three ways you can get value out of pyvespa:

  1. You can connect to a running Vespa application.
  2. You can build and deploy a Vespa application using pyvespa API.
  3. You can deploy an application from Vespa config files stored on disk.

We will review each of those methods.

Connect to a running Vespa application

In case you already have a Vespa application running somewhere, you can directly instantiate the Vespa class with the appropriate endpoint. The example below connects to…

Introducing pyvespa simplified API. Build Vespa application from python with few lines of code.

This post will introduce you to the simplified pyvespa API that allows us to build a basic text search application from scratch with just a few code lines from python. Follow-up posts will add layers of complexity by incrementally building on top of the basic app described here.

Photo by Sarah Dorweiler on Unsplash

pyvespa exposes a subset of Vespa API in python. The library’s primary goal is to allow for faster prototyping and facilitate Machine Learning experiments for Vespa applications. I have written about how we can use it to connect…

How to ensure training and serving encoding compatibility

There are cases where the inputs to your Transformer model are pairs of sentences, but you want to process each sentence of the pair at different times due to your application’s nature. Search applications are one example.

Photo by Alice Dietrich on Unsplash

The search use case

Search applications involve a large collection of documents that can be pre-processed and stored before a search action is required. On the other hand, a query triggers a search action, and we can only process it in real-time. Search apps’ goal is to return the most relevant documents to the query as quickly as possible. …

Setup a custom Dataset, fine-tune BERT with Transformers Trainer, and export the model via ONNX

This post describes a simple way to get started with fine-tuning transformer models. It will cover the basics and introduce you to the amazing Trainer class from the transformers library. You can run the code from Google Colab but do not forget to enable GPU support.

Photo by Samule Sun on Unsplash

We use a dataset built from COVID-19 Open Research Dataset Challenge. This work is one small piece of a larger project that is to build the cord19 search app.

Install required libraries

!pip install pandas transformers

Load the dataset

To fine-tune the BERT models for the cord19 application, we need to generate a set of query-document features and labels that…

Using pyvespa to evaluate cord19 search application ranking functions currently in production.

This is the second on a series of blog posts that will show you how to improve a text search application, from downloading data to fine-tuning BERT models.

The previous post showed how to download and parse TREC-COVID data. This one will focus on evaluating two query models available in the cord19 search application. Those models will serve as baselines for future improvements.

You can also run the steps contained here from Google Colab.

Photo by Agence Olloweb on Unsplash

Download processed data

We can start by downloading the data that we have processed before.

import requests, json
from pandas import read_csv

topics = json.loads(requests.get(
relevance_data = read_csv(

A pyvespa library overview: Connect, query, collect data and evaluate query models.

Vespa is the faster, more scalable and advanced search engine currently available, imho. It has a native tensor evaluation framework, can perform approximate nearest neighbor search and deploy the latest developments in NLP modeling, such as BERT models.

This post will give you an overview of the Vespa python API available through the pyvespa library. The main goal of the library is to allow for faster prototyping and to facilitate Machine Learning experiments for Vespa applications.

Photo by David Clode on Unsplash

We are going to connect to the CORD-19 search app and use it as an example here. You can later use your own application…

Thiago G. Martins

Working on Follow me on Twitter @Thiagogm

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store