HISTORY file update.

2025-07-28 23:42:10 +03:00 · 1998-10-24 04:43:39 +00:00
parent ba63dcd6a6
commit 30b2d287fb
4 changed files with 468 additions and 318 deletions
--- a/doc/FAQ_DEV
+++ b/doc/FAQ_DEV
@@ -1,35 +1,40 @@
-Developer's Frequently Asked Questions (FAQ) for PostgreSQL
-
-Last updated: Wed Feb 11 20:23:01 EST 1998
-
-Current maintainer: Bruce Momjian (maillist@candle.pha.pa.us)
-
-The most recent version of this document can be viewed at the postgreSQL Web
-site, http://postgreSQL.org.
-
-  ------------------------------------------------------------------------
-
-Questions answered:
-
-1) What tools are available for developers?
-2) What books are good for developers?
-3) Why do we use palloc() and pfree() to allocate memory?
-4) Why do we use Node and List to make data structures?
-5) How do I add a feature or fix a bug?
-6) How do I download/update the current source tree?
-7) How do I test my changes?
-
-  ------------------------------------------------------------------------
-
-1) What tools are available for developers?
-
-Aside from the User documentation mentioned in the regular FAQ, there
-are several development tools available. First, all the files in the
-pgsql/src/tools directory are designed for developers.

+          Developer's Frequently Asked Questions (FAQ) for PostgreSQL
+                                       
+   Last updated: Fri Oct 2 15:21:32 EDT 1998
+   
+   Current maintainer: Bruce Momjian (maillist@candle.pha.pa.us)
+   
+   The most recent version of this document can be viewed at the
+   postgreSQL Web site, http://postgreSQL.org.
+     _________________________________________________________________
+   
+                                 Questions
+                                      
+   1) What tools are available for developers?
+   2) What books are good for developers?
+   3) Why do we use palloc() and pfree() to allocate memory?
+   4) Why do we use Node and List to make data structures?
+   5) How do I add a feature or fix a bug?
+   6) How do I download/update the current source tree?
+   7) How do I test my changes?
+   7) I just added a field to a structure. What else should I do?
+   8) Why are table, column, type, function, view names sometimes
+   referenced as Name or NameData, and sometimes as char *?
+   9) How do I efficiently access information in tables from the backend
+   code?
+   10) What is elog()?
+     _________________________________________________________________
+   
+  1) What tools are available for developers?
+  
+   Aside from the User documentation mentioned in the regular FAQ, there
+   are several development tools available. First, all the files in the
+   /tools directory are designed for developers.
        RELEASE_CHANGES         changes we have to make for each release
        SQL_keywords            standard SQL'92 keywords
-        backend                 web flowchart of the backend directories
+        backend                 description/flowchart of the backend directories
        ccsym                   find standard defines made by your compiler
        entab                   converts tabs to spaces, used by pgindent
        find_static             finds functions that could be made static
@@ -42,104 +47,230 @@ pgsql/src/tools directory are designed for developers.
        mkldexport              create AIX exports file
        pgindent                indents C source files

-Let me note some of these. If you point your browser at the
-pgsql/src/tools/backend directory, you will see all the backend
-components in a flow chart. You can click on any one to see a
-description. If you then click on the directory name, you will be taken
-to the source directory, to browse the actual source code behind it. We
-also have several README files in some source directories to describe
-the function of the module. The browser will display these when you
-enter the directory also. The pgsql/src/tools/backend directory is also
-contained on our web page under the title Backend Flowchart.
+   Let me note some of these. If you point your browser at
+   file:/usr/local/src/pgsql/src/tools/backend/index.html, you will see
+   a few paragraphs describing the data flow, the backend components in
+   a flow chart, and a description of the shared memory
+   area. You can click on any flowchart box to see a description. If you
+   then click on the directory name, you will be taken to the source
+   directory, to browse the actual source code behind it. We also have
+   several README files in some source directories to describe the
+   function of the module. The browser will display these when you enter
+   the directory also. The tools/backend directory is also contained on
+   our web page under the title How PostgreSQL Processes a Query.
+   
+   Second, you really should have an editor that can handle tags, so you
+   can tag a function call to see the function definition, and then tag
+   inside that function to see an even lower-level function, and then
+   back out twice to return to the original function. Most editors
+   support this via tags or etags files.
+   
+   Third, you need to get mkid from ftp.postgresql.org. By running
+   tools/make_mkid, an archive of source symbols can be created that can
+   be rapidly queried like grep or edited.
+   
+   make_diff has tools to create patch diff files that can be applied to
+   the distribution.
+   
+   pgindent will format source files to match our standard format, which
+   has four-space tabs, and an indenting format specified by flags to
+   your operating system's indent utility.
+   
+   pgindent is run on all source files just before each beta test period.
+   It auto-formats all source files to make them consistent. Comment
+   blocks that need specific line breaks should be formatted as block
+   comments, where the comment starts as /*------. These comments will
+   not be reformatted in any way.
+   
+  2) What books are good for developers?
+  
+   I have four good books: An Introduction to Database Systems, by C.J.
+   Date, Addison-Wesley; A Guide to the SQL Standard, by C.J. Date, et
+   al., Addison-Wesley; Fundamentals of Database Systems, by Elmasri and
+   Navathe; and Transaction Processing, by Jim Gray, Morgan Kaufmann.
+   
+   There is also a database performance site, with a handbook on-line
+   written by Jim Gray at http://www.benchmarkresources.com.
+   
+  3) Why do we use palloc() and pfree() to allocate memory?
+  
+   palloc() and pfree() are used in place of malloc() and free() because
+   we automatically free all memory allocated when a transaction
+   completes. This makes it easier to make sure we free memory that gets
+   allocated in one place, but only freed much later. There are several
+   contexts that memory can be allocated in, and this controls when the
+   allocated memory is automatically freed by the backend.
+   
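To make the idea of a context concrete, here is a minimal sketch in
plain C. It is not the backend's allocator: ToyContext, toy_alloc() and
toy_reset() are hypothetical names invented for this illustration. It
only shows the principle that every allocation is remembered by its
context, so one call at the end of a transaction can release all of it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Header placed in front of every allocation (toy code, not PostgreSQL). */
typedef union ToyChunk
{
    union ToyChunk *next;       /* next chunk owned by the same context */
    max_align_t align;          /* keeps the payload suitably aligned */
} ToyChunk;

/* A context is just the list of chunks allocated in it. */
typedef struct ToyContext
{
    ToyChunk   *chunks;
    size_t      nchunks;
} ToyContext;

/* Allocate size bytes and record the chunk in the context. */
void *
toy_alloc(ToyContext *cxt, size_t size)
{
    ToyChunk   *chunk;

    assert(cxt != NULL);
    chunk = malloc(sizeof(ToyChunk) + size);
    if (chunk == NULL)
        abort();
    chunk->next = cxt->chunks;
    cxt->chunks = chunk;
    cxt->nchunks++;
    return (void *) (chunk + 1);    /* payload starts after the header */
}

/* Free everything allocated in the context; returns the chunk count. */
size_t
toy_reset(ToyContext *cxt)
{
    size_t      freed = 0;

    assert(cxt != NULL);
    while (cxt->chunks != NULL)
    {
        ToyChunk   *next = cxt->chunks->next;

        free(cxt->chunks);
        cxt->chunks = next;
        freed++;
    }
    cxt->nchunks = 0;
    return freed;
}
```

A caller would allocate freely with toy_alloc() while doing its work and
call toy_reset() once when the work is over, instead of pairing every
allocation with its own free().
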
+  4) Why do we use Node and List to make data structures?
+  
+   We do this because it gives us a consistent and flexible way to pass
+   data inside the backend. Every node has a NodeTag which
+   specifies what type of data is inside the Node. Lists are lists of
+   Nodes. lfirst(), lnext(), and foreach() are used to get, skip, and
+   traverse through Lists.
+   
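The tagged-node idea can be sketched in a few lines of standalone C.
The names below (ToyNode, ToyInt, ToyList, toy_lfirst(), toy_lnext(),
toy_foreach()) are made up for this example and only imitate the shape
of the real Node and List code; the backend's own definitions differ.

```c
#include <assert.h>
#include <stddef.h>

/* Every toy node starts with a tag that says what it really is. */
typedef enum ToyNodeTag
{
    T_ToyInt = 1,
    T_ToyList
} ToyNodeTag;

typedef struct ToyNode
{
    ToyNodeTag  type;
} ToyNode;

typedef struct ToyInt
{
    ToyNodeTag  type;           /* always T_ToyInt */
    int         value;
} ToyInt;

typedef struct ToyList
{
    ToyNodeTag  type;           /* always T_ToyList */
    ToyNode    *elem;           /* the node held by this list cell */
    struct ToyList *next;       /* next cell, or NULL at the end */
} ToyList;

#define toy_node_tag(n)     (((ToyNode *) (n))->type)
#define toy_lfirst(l)       ((l)->elem)
#define toy_lnext(l)        ((l)->next)
#define toy_foreach(cell, list) \
    for ((cell) = (list); (cell) != NULL; (cell) = toy_lnext(cell))

/* Walk a list and add up every element that is tagged as a ToyInt. */
int
toy_sum_ints(ToyList *list)
{
    ToyList    *cell;
    int         sum = 0;

    toy_foreach(cell, list)
    {
        ToyNode    *n = toy_lfirst(cell);

        if (toy_node_tag(n) == T_ToyInt)
            sum += ((ToyInt *) n)->value;
    }
    return sum;
}
```

Because the tag is always the first field, code can accept any node
pointer, inspect the tag, and only then cast to the specific structure.
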
+   You can print nodes easily inside gdb. First, to disable output
+   truncation:

-Second, you really should have an editor that can handle tags, so you can
-tag a function call to see the function definition, and then tag inside that
-function to see an even lower-level function, and then back out twice to
-return to the original function. Most editors support this via tags or etags
-files.
+        (gdb) set print elements 0

-Third, you need to get mkid from ftp.postgresql.org. By running
-tools/make_mkid, an archive of source symbols can be created that can be
-rapidly queried like grep or edited.
+   You may then use either of the next two commands to print out List,
+   Node, and structure contents. The first prints in a short format, and
+   the second in a long format:

-make_diff has tools to create patch diff files that can be applied to the
-distribution.
+        (gdb) call print(any_pointer)
+        (gdb) call pprint(any_pointer)

-pgindent will format source files to match our standard format, which has
-four-space tabs, and an indenting format specified by flags to the your
-operating system's utility indent.
+  5) How do I add a feature or fix a bug?
+  
+   The source code is over 250,000 lines. Many problems/features are
+   isolated to one specific area of the code. Others require knowledge of
+   much of the source. If you are confused about where to start, ask the
+   hackers list, and they will be glad to assess the complexity and give
+   pointers on where to start.
+   
+   Another thing to keep in mind is that many fixes and features can be
+   added with surprisingly little code. I often start by adding code,
+   then looking at other areas in the code where similar things are done,
+   and by the time I am finished, the patch is quite small and compact.
+   
+   When adding code, keep in mind that it should use the existing
+   facilities in the source, for performance reasons and for simplicity.
+   Often a review of existing code doing similar things is helpful.
+   
+  6) How do I download/update the current source tree?
+  
+   There are several ways to obtain the source tree. Occasional
+   developers can just get the most recent source tree snapshot from
+   ftp.postgresql.org. For regular developers, you can use CVS. CVS
+   allows you to download the source tree, then occasionally update your
+   copy of the source tree with any new changes. Using CVS, you don't
+   have to download the entire source each time, only the changed files.
+   Anonymous CVS does not allow developers to update the remote source
+   tree, though privileged developers can do this. There is a CVS FAQ on
+   our web site that describes how to use remote CVS. You can also use
+   CVSup, which has similar functionality, and is available from
+   ftp.postgresql.org.
+   
+   To update the source tree, there are two ways. You can generate a
+   patch against your current source tree, perhaps using the make_diff
+   tools mentioned above, and send it to the patches list. Patches are
+   reviewed, and applied in a timely manner. If the patch is major, and
+   we are in beta testing, the developers may wait for the final release
+   before applying your patches.
+   
+   For hard-core developers, Marc (scrappy@postgresql.org) will give you a
+   Unix shell account on postgresql.org, so you can use CVS to update the
+   main source tree, or you can ftp your files into your account, patch,
+   and cvs install the changes directly into the source tree.
+   
+  7) How do I test my changes?
+  
+   First, use psql to make sure it is working as you expect. Then run
+   src/test/regress and get the output of src/test/regress/checkresults
+   with and without your changes, to see that your patch does not change
+   the regression test in unexpected ways. This practice has saved me
+   many times. The regression tests test the code in ways I would never
+   do, and have caught many bugs in my patches. By finding the problems
+   now, you save yourself a lot of debugging later when things are
+   broken, and you can't figure out when it happened.
+   
+  7) I just added a field to a structure. What else should I do?
+  
+   The structures passed between the parser, rewrite, optimizer, and
+   executor require quite a bit of support. Most structures have support
+   routines in src/backend/nodes used to create, copy, read, and output
+   those structures. Make sure you add support for your new field to
+   these files. Find any other places the structure may need code for
+   your new field. mkid is helpful with this (see above).
+   
+  8) Why are table, column, type, function, view names sometimes referenced as
+  Name or NameData, and sometimes as char *?
+  
+   Table, column, type, function, and view names are stored in system
+   tables in columns of type Name. Name is a fixed-length,
+   null-terminated type of NAMEDATALEN bytes. (The default value for
+   NAMEDATALEN is 32 bytes.)
+        typedef struct nameData
+        {
+            char        data[NAMEDATALEN];
+        } NameData;
+        typedef NameData *Name;

-2) What books are good for developers?
+   Table, column, type, function, and view names that come in to the
+   backend via user queries are stored as variable-length,
+   null-terminated character strings.
+   
+   Many functions are called with both types of names, e.g. heap_open().
+   Because the Name type is null-terminated, it is safe to pass it to a
+   function expecting a char *. Because on-disk names (Name) are often
+   compared to user-supplied names (char *), there are many cases where
+   Name and char * are used interchangeably.
+   
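The following standalone sketch shows why the two representations mix
so easily. The NameData definition is the one quoted above; the two
helpers, toy_namestrcpy() and toy_name_matches(), are hypothetical and
exist only for this illustration.

```c
#include <assert.h>
#include <string.h>

#define NAMEDATALEN 32          /* the default mentioned in the text */

typedef struct nameData
{
    char        data[NAMEDATALEN];
} NameData;
typedef NameData *Name;

/*
 * Hypothetical helper: store a C string in a NameData, zero-padding it
 * and truncating to NAMEDATALEN - 1 bytes so it stays null-terminated.
 */
void
toy_namestrcpy(Name name, const char *str)
{
    assert(name != NULL && str != NULL);
    memset(name->data, 0, NAMEDATALEN);
    strncpy(name->data, str, NAMEDATALEN - 1);
}

/*
 * Hypothetical helper: compare a fixed-length Name with a
 * variable-length, user-supplied string. Since the Name is
 * null-terminated, an ordinary strcmp() is enough.
 */
int
toy_name_matches(Name name, const char *str)
{
    assert(name != NULL && str != NULL);
    return strcmp(name->data, str) == 0;
}
```

Note that a name longer than NAMEDATALEN - 1 bytes is silently cut short
by the copy in this sketch, so it no longer compares equal to the
original string.
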
+  9) How do I efficiently access information in tables from the backend code?
+  
+   You first need to find the tuples (rows) you are interested in. There
+   are two ways. First, SearchSysCacheTuple() and related functions allow
+   you to query the system catalogs. This is the preferred way to access
+   system tables, because the first call to the cache loads the needed
+   rows, and future requests can return the results without accessing the
+   base table. Some of the caches use system table indexes to look up
+   tuples. A list of available caches is located in
+   src/backend/utils/cache/syscache.c.
+   src/backend/utils/cache/lsyscache.c contains many column-specific
+   cache lookup functions.
+   
+   The rows returned are cache-owned versions of the heap rows. They are
+   invalidated when the base table changes. Because the cache is local to
+   each backend, you may use the pointer returned from the cache for
+   short periods without making a copy of the tuple. If you send the
+   pointer into a large function that will be doing its own cache
+   lookups, it is possible the cache entry may be flushed, so you should
+   use SearchSysCacheTupleCopy() in these cases, and pfree() the tuple
+   when you are done.
+   
+   If you can't use the system cache, you will need to retrieve the data
+   directly from the heap table, using the buffer cache that is shared by
+   all backends. The backend automatically takes care of loading the rows
+   into the buffer cache.
+   
+   Open the table with heap_open(). You can then start a table scan with
+   heap_beginscan(), then use heap_getnext() and continue as long as
+   HeapTupleIsValid() returns true. Then do a heap_endscan(). Keys can be
+   assigned to the scan. No indexes are used, so all rows are going to be
+   compared to the keys, and only the valid rows returned.
+   
+   You can also use heap_fetch() to fetch rows by block number/offset.
+   While scans automatically lock/unlock rows from the buffer cache, with
+   heap_fetch(), you must pass a Buffer pointer, and ReleaseBuffer() it
+   when completed. Once you have the row, you can get data that is common
+   to all tuples, like t_ctid and t_oid, by merely accessing the
+   HeapTuple structure entries. If you need a table-specific column, you
+   should take the HeapTuple pointer, and use the GETSTRUCT() macro to
+   access the table-specific start of the tuple. You then cast the
+   pointer as a Form_pg_proc pointer if you are accessing the pg_proc
+   table, or TypeTupleForm if you are accessing pg_type. You can then
+   access the columns by using a structure pointer:

-I have three good books, An Introduction to Database Systems, by C.J. Date,
-Addison, Wesley, A Guide to the SQL Standard, by C.J. Date, et. al,
-Addison, Wesley, and Transaction Processing:  Concepts and Techniques,
-by Jim Gray and Andreas Reuter, Morgan, Kaufmann.
+        ((Form_pg_class) GETSTRUCT(tuple))->relnatts

-3) Why do we use palloc() and pfree() to allocate memory?
-
-palloc() and pfree() are used in place of malloc() and free() because we
-automatically free all memory allocated when a transaction completes. This
-makes it easier to make sure we free memory that gets allocated in one
-place, but only freed much later. There are several contexts that memory can
-be allocated in, and this controls when the allocated memory is
-automatically freed by the backend.
-
-4) Why do we use Node and List to make data structures?
-
-We do this because this allows a consistent way to pass data inside the
-backend in a flexible way. Every node has a NodeTag which specifies what
-type of data is inside the Node. Lists are lists of Nodes. lfirst(),
-lnext(), and foreach() are used to get, skip, and traverse through Lists.
-
-5) How do I add a feature or fix a bug?
-
-The source code is over 250,000 lines. Many problems/features are isolated
-to one specific area of the code. Others require knowledge of much of the
-source. If you are confused about where to start, ask the hackers list, and
-they will be glad to assess the complexity and give pointers on where to
-start.
-
-Another thing to keep in mind is that many fixes and features can be added
-with surprisingly little code. I often start by adding code, then looking at
-other areas in the code where similar things are done, and by the time I am
-finished, the patch is quite small and compact.
-
-When adding code, keep in mind that it should use the existing facilities in
-the source, for performance reasons and for simplicity. Often a review of
-existing code doing similar things is helpful.
-
-6) How do I download/update the current source tree?
-
-There are several ways to obtain the source tree. Occasional developers can
-just get the most recent source tree snapshot from ftp.postgresql.org. For
-regular developers, you can use CVSup, which is available from
-ftp.postgresql.org too. CVSup allows you to download the source tree, then
-occasionally update your copy of the source tree with any new changes. Using
-CVSup, you don't have to download the entire source each time, only the
-changed files. CVSup does not allow developers to update the source tree.
-
-Anonymous CVS is available too.  See the doc/FAQ_CVS file for more
-information.
-
-To update the source tree, there are two ways. You can generate a patch
-against your current source tree, perhaps using the make_diff tools
-mentioned above, and send them to the patches list. They will be reviewed,
-and applied in a timely manner. If the patch is major, and we are in beta
-testing, the developers may wait for the final release before applying your
-patches.
-
-For hard-core developers, Marc(scrappy@postgresql.org) will give you a Unix
-shell account on postgresql.org, and you can ftp your files into your
-account, patch, and cvs install the changes directly into the source tree.
-
-6) How do I test my changes?
-
-First, use psql to make sure it is working as you expect. Then run
-src/test/regress and get the output of src/test/regress/checkresults with
-and without your changes, to see that your patch does not change the
-regression test in unexpected ways. This practice has saved me many times.
-The regression tests test the code in ways I would never do, and has caught
-many bugs in my patches. By finding the problems now, you save yourself a
-lot of debugging later when things are broken, and you can't figure out when
-it happened.
+   You should not directly change live tuples in this way. The best way
+   is to use heap_tuplemodify() and pass it your palloc'ed tuple, and the
+   values you want changed. It returns another palloc'ed tuple, which you
+   pass to heap_replace(). You can delete tuples by passing the tuple's
+   t_ctid to heap_destroy(). Remember, a tuple can be a system cache
+   version, which may go away soon after you get it; a buffer cache
+   version, which will go away when you heap_getnext(), heap_endscan(),
+   or ReleaseBuffer() (in the heap_fetch() case); or a palloc'ed tuple,
+   which you must pfree() when finished.
+   
+  10) What is elog()?
+  
+   elog() is used to send messages to the front-end, and optionally
+   terminate the current query being processed. The first parameter is an
+   elog level of NOTICE, DEBUG, ERROR, or FATAL. NOTICE prints on the
+   user's terminal and the postmaster logs. DEBUG prints only in the
+   postmaster logs. ERROR prints in both places, and terminates the
+   current query, never returning from the call. FATAL terminates the
+   backend process. The remaining parameters of elog are a printf-style
+   set of parameters to print.
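
The level rules above can be summarized in a small standalone sketch.
This is not the backend's elog(): ToyLevel, the TOY_ flags,
toy_elog_actions() and toy_elog() are invented for the example, and the
sketch formats into a caller-supplied buffer instead of sending
anything. Where FATAL prints is not stated above, so the sketch assumes
it reports in both places, like ERROR.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

typedef enum ToyLevel
{
    TOY_NOTICE = 0,
    TOY_DEBUG,
    TOY_ERROR,
    TOY_FATAL
} ToyLevel;

/* What a message at a given level causes to happen. */
#define TOY_TO_CLIENT       0x1     /* shown on the user's terminal */
#define TOY_TO_LOG          0x2     /* written to the postmaster logs */
#define TOY_ABORT_QUERY     0x4     /* current query is terminated */
#define TOY_EXIT_BACKEND    0x8     /* backend process is terminated */

/* Map a level to the behavior described in the text. */
int
toy_elog_actions(ToyLevel level)
{
    switch (level)
    {
        case TOY_NOTICE:
            return TOY_TO_CLIENT | TOY_TO_LOG;
        case TOY_DEBUG:
            return TOY_TO_LOG;
        case TOY_ERROR:
            return TOY_TO_CLIENT | TOY_TO_LOG | TOY_ABORT_QUERY;
        case TOY_FATAL:
            /* assumption: reported in both places before exiting */
            return TOY_TO_CLIENT | TOY_TO_LOG | TOY_EXIT_BACKEND;
    }
    return 0;
}

/* Format "LEVEL:  message" printf-style and return the level's actions. */
int
toy_elog(char *buf, size_t buflen, ToyLevel level, const char *fmt, ...)
{
    static const char *const names[] = {"NOTICE", "DEBUG", "ERROR", "FATAL"};
    va_list     args;
    int         used;

    assert(buf != NULL && buflen > 0 && fmt != NULL);
    used = snprintf(buf, buflen, "%s:  ", names[level]);
    if (used > 0 && (size_t) used < buflen)
    {
        va_start(args, fmt);
        vsnprintf(buf + used, buflen - (size_t) used, fmt, args);
        va_end(args);
    }
    return toy_elog_actions(level);
}
```

Unlike this sketch, the real call at ERROR level never returns to its
caller, so code placed after such a call is not reached.
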