Peter Seiderer
February 6, 2002
In the next chapter a simple solution for this problem is described. The solution uses a shadow table to store the older versions of database table entries.
In the third chapter a simple algorithm for delta compressing (computing the edited parts of a text) is described. This algorithm is then used in chapter four to introduce a space saving approach for the version control system.
In figure 2 an example of a database table is shown. The first row labeled id is the primary key, the second row labled data1 is some integer value and the third row labeld data2 is of type varchar. Such a (empty) table is easily created by the statement1:
CREATE TABLE table1 (id SERIAL, data1 INT, data2 VARCHAR);
A first proach to provide a simple form of version control for the database table table1 is to create a shadow table via the command
CREATE TABLE shadow_table1 (id INT, data1 INT, data2 INT, version INT);where old versions of the entries are automatically stored by a PL/SQL function which is triggered by an update on table entry.
Therefore first the PS/SQL function is created which stores the old entry plus an additional version number in the shadow table:
CREATE FUNCTION func_table1_vc () RETURNS OPAQUE AS ' DECLARE v integer; BEGIN SELECT INTO v version FROM shadow_table1 WHERE id = OLD.id ORDER BY version DESC LIMIT 1; IF v ISNULL THEN v := 0; ELSE v := v + 1; END IF; INSERT INTO shadow_table1 VALUES (OLD.id,OLD.data1,OLD.data2,v); RETURN NEW; END; ' LANGUAGE 'plpgsql';
The PL/SQL function is called by the following trigger on a update:
CREATE TRIGGER trig_table1_vc AFTER UPDATE ON table1 FOR EACH ROW EXECUTE PROCEDURE func_table1_vc();
So if the text of row two of the example is updated to ``Hello Reader'' the shadow table would look like the following:
Note that by the use of the shadow table to store the whole version information no change of the original table is necessary and no adjustment of running applications needs to be done.
As an example take the sentence ``looked there between'' which is changed to ``Chia looked out between''2.
Instead of storing the whole new sentence it would be sufficient to store only the edited parts ``Chia '' and ``out'' and the information which parts are copied from the original string as shown in figure 3.
For a detailed description on how to find such a delta see [Bur96] or [BCD95].
CREATE FUNCTION pg_diff(text, text) RETURNS bytea AS 'libpg_ci_diff.so' LANGUAGE 'c' WITH (isStrict);
CREATE FUNCTION pg_patch(text, bytea) RETURNS text AS 'libpg_ci_diff.so' LANGUAGE 'c' WITH (isStrict);.
The new update triggered procedure is changed now to the following:
CREATE FUNCTION func_table1_vc () RETURNS OPAQUE AS ' DECLARE v integer; BEGIN SELECT INTO v version FROM shadow_table1 WHERE id = OLD.id ORDER BY version DESC LIMIT 1; IF v ISNULL THEN v := 0; ELSE v := v + 1; END IF; INSERT INTO shadow_table1 VALUES (OLD.id,OLD.data1, pg_diff(NEW.data2,OLD.data2),v); RETURN NEW; END; ' LANGUAGE 'plpgsql';
With this configuration the update command from the previous example plus a additional update of the text to ``Hello Reader!'' will lead to the following entries in the database tables table1 and shadow_table1:
The entry ``c:0:6:i:5:World'' means copy the first 6 characters from ``Hello Reader'' and then insert the 5 character long string ``World''.
This is not a space saving example but imagine a some kbytes long text and a user only changing some minor typos.
The only thing remaining now is a comfortable view on the shadow_table, so not only to show the delta files but the whole different versions. This is done by the following:
CREATE FUNCTION patch_data1 (integer, integer) RETURNS VARCHAR AS ' DECLARE i ALIAS FOR $1; v ALIAS FOR $2; tmp varchar; delta RECORD; BEGIN SELECT INTO tmp data1 FROM test WHERE index = i; FOR delta IN SELECT data2 FROM shadow_test WHERE index = i AND version >= v ORDER BY version DESC LOOP tmp := pg_patch(tmp, delta.data2); END LOOP; RETURN tmp; END; ' LANGUAGE 'plpgsql';
CREATE VIEW view_shadow_table1 AS SELECT id AS id, data1 as data1 patch_data2(index, version) AS data2 version AS version FROM shadow_table1;
A SELECT * FROM view_shadow_table1; will lead to the following output: