During the last PGSession organized by Dalibo, I wrote and led a workshop
(french) on the migration to PostgreSQL using Foreign Data Wrappers, or FDW.
This was an opportunity to present to the general public the db_migrator
extension for which I wrote an article on this blog.
While working on this workshop, we noticed that copying data with the
db_migrator extension is not perfectly supported. Indeed, although there is a
low-level function to distribute the transfer table by table over several
processes, many situations will require writing a large number of SQL queries to
get out of trouble. Over the following months, I worked on the design of an
assistant written in PL/pgSQL whose purpose is to simplify the generation
of these queries.
In computer science, a migration project involves changing one or more technical
components without impacting any application behaviors. In the realm of
databases (and the profession I practice), it will involve choosing a new system
(such as PostgreSQL) to replace another (such as Oracle or Microsoft SQL
Server).
In a previous article (French), I described exhaustive steps to perform a
complete migration using Foreign Data Wrappers technology. However, the critical
data transfer step described there does not suit all situations. Let’s explore
together the alternatives that cover a large portion of the needs.
Over the past months, I have spent several weeks contributing to the
db_migrator extension. Written solely in PL/pgSQL, it enables the migration of
schemas and data from a database system to PostgreSQL using the external data I
had presented few years ago.
In this post, I present the functionality of the tool, its philosophy, and
the reason I found for its existence, even though it joins the ecosystem of
well-established open-source projects in the migration landscape. How does it
compare to Ora2Pg or pgloader in terms of value and capabilities?
Back-of-the-book index: includes names of people, places,
events, and concepts selected by the indexer as being relevant and of interest
to a possible reader of the book. (Wikipedia)
Database index: data structure that improves the speed of
data retrieval operations on a database table. (Wikipedia)
The parser stage creates a parse tree using only fixed rules about the syntactic
structure of SQL. It does not make any lookups in the system catalogs, so there
is no possibility to understand the detailed semantics of the requested operations.
What is going on from when a user sends his SQL query to getting back a data result?
This passionating question (by a limited amount of people, of course) has been
studied by Stefan Simkovics during his Master’s Thesis at Vienna University of
Technology in 1998.
His remarkable work was added in official documentation as “Overview of
PostgreSQL Internals”, which is intended to share Simkovics’ research in a
simplified way to reach a larger audience.
With this article, I’m thrilled to share recent thoughts about a subelement of
these internals, the parser. It relies on a similar approach to compiling by
using an advanced development pattern called AST (abstract syntax tree).