Mirror of https://github.com/MariaDB/server.git, synced 2025-08-29 00:08:14 +03:00
This bug was originally filed and fixed as Bug#12612184. The original fix was buggy, and it was patched by Bug#12704861. That patch was also buggy (potentially breaking crash recovery), and both fixes were reverted. This fix was not ported to the built-in InnoDB of MySQL 5.1, because the function signatures of many core functions differ from InnoDB Plugin and later versions. The block allocation routines and their callers would have to be changed so that they handle block descriptors instead of page frames.

When a record is updated so that its size grows, non-updated columns can be selected for external (off-page) storage. The bug is that the initially inserted updated record contains an all-zero BLOB pointer for the field that was not updated. Only after the BLOB pages have been allocated and written can the valid pointer be written to the record. Between the release of the page latch in mtr_commit(mtr) after btr_cur_pessimistic_update() and the re-latching of the page in btr_pcur_restore_position(), other threads can see the invalid BLOB pointer consisting of 20 zero bytes. Moreover, if the system crashes at this point, the situation could persist after crash recovery, and the contents of the non-updated column would be permanently lost.

The problem is amplified by ROW_FORMAT=DYNAMIC and ROW_FORMAT=COMPRESSED, which were introduced in innodb_file_format=barracuda in InnoDB Plugin, but the bug exists in all InnoDB versions.

The fix is as follows. After a pessimistic B-tree operation that needs to write out off-page columns, allocate the pages for these columns in the mini-transaction that performed the B-tree operation (btr_mtr), but write the pages in a separate mini-transaction (blob_mtr). Do mtr_commit(blob_mtr) before mtr_commit(btr_mtr). A quirk: do not reuse pages that were previously freed in btr_mtr. Only write the off-page columns to 'fresh' pages.
In this way, crash recovery will see redo log entries for blob_mtr before any redo log entry for btr_mtr. It will apply the BLOB page writes to pages that were marked free at that point. If crash recovery fails to see all of the btr_mtr redo log, there will be some unreachable BLOB data in free pages, but the B-tree will be in a consistent state.

btr_page_alloc_low(): Renamed from btr_page_alloc(). Add the parameter init_mtr. Return an allocated block, or NULL. If init_mtr!=mtr but the page was already X-latched in mtr, do not initialize the page.

btr_page_alloc(): Wrapper for btr_page_alloc_for_ibuf() and btr_page_alloc_low().

btr_page_free(): Add a debug assertion that the page was a B-tree page.

btr_lift_page_up(): Return the father block.

btr_compress(), btr_cur_compress_if_useful(): Add the parameter ibool adjust, for adjusting the cursor position.

btr_cur_pessimistic_update(): Preserve the cursor position when big_rec will be written and the new flag BTR_KEEP_POS_FLAG is defined. Remove a duplicate rec_get_offsets() call. Keep the X-latch on index->lock when big_rec is needed.

btr_store_big_rec_extern_fields(): Replace update_inplace with an operation code, and local_mtr with btr_mtr. When not doing a fresh insert and btr_mtr has freed pages, put aside any pages that were previously X-latched in btr_mtr, and free the pages after writing out all data. The data must be written to 'fresh' pages, because btr_mtr will be committed and written to the redo log after the BLOB writes have been written to the redo log.

btr_blob_op_is_update(): Check if an operation passed to btr_store_big_rec_extern_fields() is an update or insert-by-update.

fseg_alloc_free_page_low(), fsp_alloc_free_page(), fseg_alloc_free_extent(), fseg_alloc_free_page_general(): Add the parameter init_mtr. Return an allocated block, or NULL. If init_mtr!=mtr but the page was already X-latched in mtr, do not initialize the page.
xdes_get_descriptor_with_space_hdr(): Assert that the file space header is being X-latched.

fsp_alloc_from_free_frag(): Refactored from fsp_alloc_free_page().

fsp_page_create(): New function, for allocating, X-latching and potentially initializing a page. If init_mtr!=mtr but the page was already X-latched in mtr, do not initialize the page.

fsp_free_page(): Add ut_ad(0) to the error outcomes.

fsp_free_page(), fseg_free_page_low(): Increment mtr->n_freed_pages.

fsp_alloc_seg_inode_page(), fseg_create_general(): Assert that the page was not previously X-latched in the mini-transaction. A file segment or inode page should never be allocated in the middle of a mini-transaction that frees pages, such as btr_cur_pessimistic_delete().

fseg_alloc_free_page_low(): If the hinted page was allocated, skip the check whether the tablespace should be extended. Return NULL instead of FIL_NULL on failure. Remove the flag frag_page_allocated. Instead, return directly, because the page would already have been initialized. fseg_find_free_frag_page_slot() would return ULINT_UNDEFINED on error, not FIL_NULL. Correct a bogus assertion.

fseg_alloc_free_page(): Redefine as a wrapper macro around fseg_alloc_free_page_general().

buf_block_buf_fix_inc(): Move the definition from buf0buf.ic to buf0buf.h, so that it can be called from other modules.

mtr_t: Add n_freed_pages (number of pages that have been freed).

page_rec_get_nth_const(), page_rec_get_nth(): The inverse function of page_rec_get_n_recs_before(): get the nth record of the record list. This is faster than iterating the linked list. Refactored from page_get_middle_rec().

trx_undo_rec_copy(): Add a debug assertion for the length.

trx_undo_add_page(): Return a block descriptor or NULL instead of a page number or FIL_NULL.

trx_undo_report_row_operation(): Add debug assertions.

trx_sys_create_doublewrite_buf(): Assert that each page was not previously X-latched.

page_cur_insert_rec_zip_reorg(): Make use of page_rec_get_nth().
row_ins_clust_index_entry_by_modify(): Pass BTR_KEEP_POS_FLAG, so that the repositioning of the cursor can be avoided.

row_ins_index_entry_low(): Add DEBUG_SYNC points before and after writing off-page columns. If inserting by updating a delete-marked record, do not reposition the cursor or commit the mini-transaction before writing the off-page columns.

row_build(): Tighten a debug assertion about null BLOB pointers.

row_upd_clust_rec(): Add DEBUG_SYNC points before and after writing off-page columns. Do not reposition the cursor or commit the mini-transaction before writing the off-page columns.

rb:939 approved by Jimmy Yang
1110 lines | 29 KiB | Plaintext
/*****************************************************************************

Copyright (c) 1995, 2012, Oracle and/or its affiliates. All Rights Reserved.
Copyright (c) 2008, Google Inc.

Portions of this file contain modifications contributed and copyrighted by
Google, Inc. Those modifications are gratefully acknowledged and are described
briefly in the InnoDB documentation. The contributions by Google are
incorporated with their permission, and subject to the conditions contained in
the file COPYING.Google.

This program is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free Software
Foundation; version 2 of the License.

This program is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS
FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with
this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Suite 500, Boston, MA 02110-1335 USA

*****************************************************************************/

/**************************************************//**
@file include/buf0buf.ic
The database buffer buf_pool

Created 11/5/1995 Heikki Tuuri
*******************************************************/

#include "mtr0mtr.h"
#ifndef UNIV_HOTBACKUP
#include "buf0flu.h"
#include "buf0lru.h"
#include "buf0rea.h"

/********************************************************************//**
Reads the freed_page_clock of a buffer block.
@return freed_page_clock */
UNIV_INLINE
ulint
buf_page_get_freed_page_clock(
/*==========================*/
	const buf_page_t*	bpage)	/*!< in: block */
{
	/* This is sometimes read without holding buf_pool_mutex. */
	return(bpage->freed_page_clock);
}

/********************************************************************//**
Reads the freed_page_clock of a buffer block.
@return freed_page_clock */
UNIV_INLINE
ulint
buf_block_get_freed_page_clock(
/*===========================*/
	const buf_block_t*	block)	/*!< in: block */
{
	return(buf_page_get_freed_page_clock(&block->page));
}

/********************************************************************//**
Tells if a block is still close enough to the MRU end of the LRU list
meaning that it is not in danger of getting evicted and also implying
that it has been accessed recently.
Note that this is for heuristics only and does not reserve buffer pool
mutex.
@return TRUE if block is close to MRU end of LRU */
UNIV_INLINE
ibool
buf_page_peek_if_young(
/*===================*/
	const buf_page_t*	bpage)	/*!< in: block */
{
	/* FIXME: bpage->freed_page_clock is 31 bits */
	return((buf_pool->freed_page_clock & ((1UL << 31) - 1))
	       < ((ulint) bpage->freed_page_clock
		  + (buf_pool->curr_size
		     * (BUF_LRU_OLD_RATIO_DIV - buf_LRU_old_ratio)
		     / (BUF_LRU_OLD_RATIO_DIV * 4))));
}

/********************************************************************//**
Recommends a move of a block to the start of the LRU list if there is danger
of dropping from the buffer pool. NOTE: does not reserve the buffer pool
mutex.
@return TRUE if should be made younger */
UNIV_INLINE
ibool
buf_page_peek_if_too_old(
/*=====================*/
	const buf_page_t*	bpage)	/*!< in: block to make younger */
{
	if (UNIV_UNLIKELY(buf_pool->freed_page_clock == 0)) {
		/* If eviction has not started yet, do not update the
		statistics or move blocks in the LRU list. This is
		either the warm-up phase or an in-memory workload. */
		return(FALSE);
	} else if (buf_LRU_old_threshold_ms && bpage->old) {
		unsigned	access_time = buf_page_is_accessed(bpage);

		if (access_time > 0
		    && ((ib_uint32_t) (ut_time_ms() - access_time))
		    >= buf_LRU_old_threshold_ms) {
			return(TRUE);
		}

		buf_pool->stat.n_pages_not_made_young++;
		return(FALSE);
	} else {
		return(!buf_page_peek_if_young(bpage));
	}
}

/*********************************************************************//**
Gets the current size of buffer buf_pool in bytes.
@return size in bytes */
UNIV_INLINE
ulint
buf_pool_get_curr_size(void)
/*========================*/
{
	return(buf_pool->curr_size * UNIV_PAGE_SIZE);
}

/********************************************************************//**
Gets the smallest oldest_modification lsn for any page in the pool. Returns
zero if all modified pages have been flushed to disk.
@return oldest modification in pool, zero if none */
UNIV_INLINE
ib_uint64_t
buf_pool_get_oldest_modification(void)
/*==================================*/
{
	buf_page_t*	bpage;
	ib_uint64_t	lsn;

	buf_pool_mutex_enter();

	bpage = UT_LIST_GET_LAST(buf_pool->flush_list);

	if (bpage == NULL) {
		lsn = 0;
	} else {
		ut_ad(bpage->in_flush_list);
		lsn = bpage->oldest_modification;
	}

	buf_pool_mutex_exit();

	/* The returned answer may be out of date: the flush_list can
	change after the mutex has been released. */

	return(lsn);
}
#endif /* !UNIV_HOTBACKUP */

/*********************************************************************//**
Gets the state of a block.
@return state */
UNIV_INLINE
enum buf_page_state
buf_page_get_state(
/*===============*/
	const buf_page_t*	bpage)	/*!< in: pointer to the control block */
{
	enum buf_page_state	state = (enum buf_page_state) bpage->state;

#ifdef UNIV_DEBUG
	switch (state) {
	case BUF_BLOCK_ZIP_FREE:
	case BUF_BLOCK_ZIP_PAGE:
	case BUF_BLOCK_ZIP_DIRTY:
	case BUF_BLOCK_NOT_USED:
	case BUF_BLOCK_READY_FOR_USE:
	case BUF_BLOCK_FILE_PAGE:
	case BUF_BLOCK_MEMORY:
	case BUF_BLOCK_REMOVE_HASH:
		break;
	default:
		ut_error;
	}
#endif /* UNIV_DEBUG */

	return(state);
}
/*********************************************************************//**
Gets the state of a block.
@return state */
UNIV_INLINE
enum buf_page_state
buf_block_get_state(
/*================*/
	const buf_block_t*	block)	/*!< in: pointer to the control block */
{
	return(buf_page_get_state(&block->page));
}
/*********************************************************************//**
Sets the state of a block. */
UNIV_INLINE
void
buf_page_set_state(
/*===============*/
	buf_page_t*		bpage,	/*!< in/out: pointer to control block */
	enum buf_page_state	state)	/*!< in: state */
{
#ifdef UNIV_DEBUG
	enum buf_page_state	old_state = buf_page_get_state(bpage);

	switch (old_state) {
	case BUF_BLOCK_ZIP_FREE:
		ut_error;
		break;
	case BUF_BLOCK_ZIP_PAGE:
		ut_a(state == BUF_BLOCK_ZIP_DIRTY);
		break;
	case BUF_BLOCK_ZIP_DIRTY:
		ut_a(state == BUF_BLOCK_ZIP_PAGE);
		break;
	case BUF_BLOCK_NOT_USED:
		ut_a(state == BUF_BLOCK_READY_FOR_USE);
		break;
	case BUF_BLOCK_READY_FOR_USE:
		ut_a(state == BUF_BLOCK_MEMORY
		     || state == BUF_BLOCK_FILE_PAGE
		     || state == BUF_BLOCK_NOT_USED);
		break;
	case BUF_BLOCK_MEMORY:
		ut_a(state == BUF_BLOCK_NOT_USED);
		break;
	case BUF_BLOCK_FILE_PAGE:
		ut_a(state == BUF_BLOCK_NOT_USED
		     || state == BUF_BLOCK_REMOVE_HASH);
		break;
	case BUF_BLOCK_REMOVE_HASH:
		ut_a(state == BUF_BLOCK_MEMORY);
		break;
	}
#endif /* UNIV_DEBUG */
	bpage->state = state;
	ut_ad(buf_page_get_state(bpage) == state);
}

/*********************************************************************//**
Sets the state of a block. */
UNIV_INLINE
void
buf_block_set_state(
/*================*/
	buf_block_t*		block,	/*!< in/out: pointer to control block */
	enum buf_page_state	state)	/*!< in: state */
{
	buf_page_set_state(&block->page, state);
}

/*********************************************************************//**
Determines if a block is mapped to a tablespace.
@return TRUE if mapped */
UNIV_INLINE
ibool
buf_page_in_file(
/*=============*/
	const buf_page_t*	bpage)	/*!< in: pointer to control block */
{
	switch (buf_page_get_state(bpage)) {
	case BUF_BLOCK_ZIP_FREE:
		/* This is a free page in buf_pool->zip_free[].
		Such pages should only be accessed by the buddy allocator. */
		ut_error;
		break;
	case BUF_BLOCK_ZIP_PAGE:
	case BUF_BLOCK_ZIP_DIRTY:
	case BUF_BLOCK_FILE_PAGE:
		return(TRUE);
	case BUF_BLOCK_NOT_USED:
	case BUF_BLOCK_READY_FOR_USE:
	case BUF_BLOCK_MEMORY:
	case BUF_BLOCK_REMOVE_HASH:
		break;
	}

	return(FALSE);
}

#ifndef UNIV_HOTBACKUP
/*********************************************************************//**
Determines if a block should be on unzip_LRU list.
@return TRUE if block belongs to unzip_LRU */
UNIV_INLINE
ibool
buf_page_belongs_to_unzip_LRU(
/*==========================*/
	const buf_page_t*	bpage)	/*!< in: pointer to control block */
{
	ut_ad(buf_page_in_file(bpage));

	return(bpage->zip.data
	       && buf_page_get_state(bpage) == BUF_BLOCK_FILE_PAGE);
}

/*********************************************************************//**
Gets the mutex of a block.
@return pointer to mutex protecting bpage */
UNIV_INLINE
mutex_t*
buf_page_get_mutex(
/*===============*/
	const buf_page_t*	bpage)	/*!< in: pointer to control block */
{
	switch (buf_page_get_state(bpage)) {
	case BUF_BLOCK_ZIP_FREE:
		ut_error;
		return(NULL);
	case BUF_BLOCK_ZIP_PAGE:
	case BUF_BLOCK_ZIP_DIRTY:
		return(&buf_pool_zip_mutex);
	default:
		return(&((buf_block_t*) bpage)->mutex);
	}
}

/*********************************************************************//**
Get the flush type of a page.
@return flush type */
UNIV_INLINE
enum buf_flush
buf_page_get_flush_type(
/*====================*/
	const buf_page_t*	bpage)	/*!< in: buffer page */
{
	enum buf_flush	flush_type = (enum buf_flush) bpage->flush_type;

#ifdef UNIV_DEBUG
	switch (flush_type) {
	case BUF_FLUSH_LRU:
	case BUF_FLUSH_SINGLE_PAGE:
	case BUF_FLUSH_LIST:
		return(flush_type);
	case BUF_FLUSH_N_TYPES:
		break;
	}
	ut_error;
#endif /* UNIV_DEBUG */
	return(flush_type);
}
/*********************************************************************//**
Set the flush type of a page. */
UNIV_INLINE
void
buf_page_set_flush_type(
/*====================*/
	buf_page_t*	bpage,		/*!< in: buffer page */
	enum buf_flush	flush_type)	/*!< in: flush type */
{
	bpage->flush_type = flush_type;
	ut_ad(buf_page_get_flush_type(bpage) == flush_type);
}

/*********************************************************************//**
Map a block to a file page. */
UNIV_INLINE
void
buf_block_set_file_page(
/*====================*/
	buf_block_t*	block,	/*!< in/out: pointer to control block */
	ulint		space,	/*!< in: tablespace id */
	ulint		page_no)/*!< in: page number */
{
	buf_block_set_state(block, BUF_BLOCK_FILE_PAGE);
	block->page.space = space;
	block->page.offset = page_no;
}

/*********************************************************************//**
Gets the io_fix state of a block.
@return io_fix state */
UNIV_INLINE
enum buf_io_fix
buf_page_get_io_fix(
/*================*/
	const buf_page_t*	bpage)	/*!< in: pointer to the control block */
{
	enum buf_io_fix	io_fix = (enum buf_io_fix) bpage->io_fix;
#ifdef UNIV_DEBUG
	switch (io_fix) {
	case BUF_IO_NONE:
	case BUF_IO_READ:
	case BUF_IO_WRITE:
		return(io_fix);
	}
	ut_error;
#endif /* UNIV_DEBUG */
	return(io_fix);
}

/*********************************************************************//**
Gets the io_fix state of a block.
@return io_fix state */
UNIV_INLINE
enum buf_io_fix
buf_block_get_io_fix(
/*================*/
	const buf_block_t*	block)	/*!< in: pointer to the control block */
{
	return(buf_page_get_io_fix(&block->page));
}

/*********************************************************************//**
Sets the io_fix state of a block. */
UNIV_INLINE
void
buf_page_set_io_fix(
/*================*/
	buf_page_t*	bpage,	/*!< in/out: control block */
	enum buf_io_fix	io_fix)	/*!< in: io_fix state */
{
	ut_ad(buf_pool_mutex_own());
	ut_ad(mutex_own(buf_page_get_mutex(bpage)));

	bpage->io_fix = io_fix;
	ut_ad(buf_page_get_io_fix(bpage) == io_fix);
}

/*********************************************************************//**
Sets the io_fix state of a block. */
UNIV_INLINE
void
buf_block_set_io_fix(
/*=================*/
	buf_block_t*	block,	/*!< in/out: control block */
	enum buf_io_fix	io_fix)	/*!< in: io_fix state */
{
	buf_page_set_io_fix(&block->page, io_fix);
}

/********************************************************************//**
Determine if a buffer block can be relocated in memory. The block
can be dirty, but it must not be I/O-fixed or bufferfixed. */
UNIV_INLINE
ibool
buf_page_can_relocate(
/*==================*/
	const buf_page_t*	bpage)	/*!< control block being relocated */
{
	ut_ad(buf_pool_mutex_own());
	ut_ad(mutex_own(buf_page_get_mutex(bpage)));
	ut_ad(buf_page_in_file(bpage));
	ut_ad(bpage->in_LRU_list);

	return(buf_page_get_io_fix(bpage) == BUF_IO_NONE
	       && bpage->buf_fix_count == 0);
}

/*********************************************************************//**
Determine if a block has been flagged old.
@return TRUE if old */
UNIV_INLINE
ibool
buf_page_is_old(
/*============*/
	const buf_page_t*	bpage)	/*!< in: control block */
{
	ut_ad(buf_page_in_file(bpage));
	ut_ad(buf_pool_mutex_own());

	return(bpage->old);
}

/*********************************************************************//**
Flag a block old. */
UNIV_INLINE
void
buf_page_set_old(
/*=============*/
	buf_page_t*	bpage,	/*!< in/out: control block */
	ibool		old)	/*!< in: old */
{
	ut_a(buf_page_in_file(bpage));
	ut_ad(buf_pool_mutex_own());
	ut_ad(bpage->in_LRU_list);

#ifdef UNIV_LRU_DEBUG
	ut_a((buf_pool->LRU_old_len == 0) == (buf_pool->LRU_old == NULL));
	/* If a block is flagged "old", the LRU_old list must exist. */
	ut_a(!old || buf_pool->LRU_old);

	if (UT_LIST_GET_PREV(LRU, bpage) && UT_LIST_GET_NEXT(LRU, bpage)) {
		const buf_page_t*	prev = UT_LIST_GET_PREV(LRU, bpage);
		const buf_page_t*	next = UT_LIST_GET_NEXT(LRU, bpage);
		if (prev->old == next->old) {
			ut_a(prev->old == old);
		} else {
			ut_a(!prev->old);
			ut_a(buf_pool->LRU_old == (old ? bpage : next));
		}
	}
#endif /* UNIV_LRU_DEBUG */

	bpage->old = old;
}

/*********************************************************************//**
Determine the time of first access of a block in the buffer pool.
@return ut_time_ms() at the time of first access, 0 if not accessed */
UNIV_INLINE
unsigned
buf_page_is_accessed(
/*=================*/
	const buf_page_t*	bpage)	/*!< in: control block */
{
	ut_ad(buf_page_in_file(bpage));

	return(bpage->access_time);
}

/*********************************************************************//**
Flag a block accessed. */
UNIV_INLINE
void
buf_page_set_accessed(
/*==================*/
	buf_page_t*	bpage,		/*!< in/out: control block */
	ulint		time_ms)	/*!< in: ut_time_ms() */
{
	ut_a(buf_page_in_file(bpage));
	ut_ad(buf_pool_mutex_own());

	if (!bpage->access_time) {
		/* Make this the time of the first access. */
		bpage->access_time = time_ms;
	}
}

/*********************************************************************//**
Gets the buf_block_t handle of a buffered file block if an uncompressed
page frame exists, or NULL.
@return control block, or NULL */
UNIV_INLINE
buf_block_t*
buf_page_get_block(
/*===============*/
	buf_page_t*	bpage)	/*!< in: control block, or NULL */
{
	if (UNIV_LIKELY(bpage != NULL)) {
		ut_ad(buf_page_in_file(bpage));

		if (buf_page_get_state(bpage) == BUF_BLOCK_FILE_PAGE) {
			return((buf_block_t*) bpage);
		}
	}

	return(NULL);
}
#endif /* !UNIV_HOTBACKUP */

#ifdef UNIV_DEBUG
/*********************************************************************//**
Gets a pointer to the memory frame of a block.
@return pointer to the frame */
UNIV_INLINE
buf_frame_t*
buf_block_get_frame(
/*================*/
	const buf_block_t*	block)	/*!< in: pointer to the control block */
{
	ut_ad(block);

	switch (buf_block_get_state(block)) {
	case BUF_BLOCK_ZIP_FREE:
	case BUF_BLOCK_ZIP_PAGE:
	case BUF_BLOCK_ZIP_DIRTY:
	case BUF_BLOCK_NOT_USED:
		ut_error;
		break;
	case BUF_BLOCK_FILE_PAGE:
# ifndef UNIV_HOTBACKUP
		ut_a(block->page.buf_fix_count > 0);
# endif /* !UNIV_HOTBACKUP */
		/* fall through */
	case BUF_BLOCK_READY_FOR_USE:
	case BUF_BLOCK_MEMORY:
	case BUF_BLOCK_REMOVE_HASH:
		goto ok;
	}
	ut_error;
ok:
	return((buf_frame_t*) block->frame);
}
#endif /* UNIV_DEBUG */

/*********************************************************************//**
Gets the space id of a block.
@return space id */
UNIV_INLINE
ulint
buf_page_get_space(
/*===============*/
	const buf_page_t*	bpage)	/*!< in: pointer to the control block */
{
	ut_ad(bpage);
	ut_a(buf_page_in_file(bpage));

	return(bpage->space);
}

/*********************************************************************//**
Gets the space id of a block.
@return space id */
UNIV_INLINE
ulint
buf_block_get_space(
/*================*/
	const buf_block_t*	block)	/*!< in: pointer to the control block */
{
	ut_ad(block);
	ut_a(buf_block_get_state(block) == BUF_BLOCK_FILE_PAGE);

	return(block->page.space);
}

/*********************************************************************//**
Gets the page number of a block.
@return page number */
UNIV_INLINE
ulint
buf_page_get_page_no(
/*=================*/
	const buf_page_t*	bpage)	/*!< in: pointer to the control block */
{
	ut_ad(bpage);
	ut_a(buf_page_in_file(bpage));

	return(bpage->offset);
}

/*********************************************************************//**
Gets the page number of a block.
@return page number */
UNIV_INLINE
ulint
buf_block_get_page_no(
/*==================*/
	const buf_block_t*	block)	/*!< in: pointer to the control block */
{
	ut_ad(block);
	ut_a(buf_block_get_state(block) == BUF_BLOCK_FILE_PAGE);

	return(block->page.offset);
}

/*********************************************************************//**
Gets the compressed page size of a block.
@return compressed page size, or 0 */
UNIV_INLINE
ulint
buf_page_get_zip_size(
/*==================*/
	const buf_page_t*	bpage)	/*!< in: pointer to the control block */
{
	return(bpage->zip.ssize ? 512 << bpage->zip.ssize : 0);
}

/*********************************************************************//**
Gets the compressed page size of a block.
@return compressed page size, or 0 */
UNIV_INLINE
ulint
buf_block_get_zip_size(
/*===================*/
	const buf_block_t*	block)	/*!< in: pointer to the control block */
{
	return(block->page.zip.ssize ? 512 << block->page.zip.ssize : 0);
}

#ifndef UNIV_HOTBACKUP
#if defined UNIV_DEBUG || defined UNIV_ZIP_DEBUG
/*********************************************************************//**
Gets the compressed page descriptor corresponding to an uncompressed page
if applicable.
@return compressed page descriptor, or NULL */
UNIV_INLINE
const page_zip_des_t*
buf_frame_get_page_zip(
/*===================*/
	const byte*	ptr)	/*!< in: pointer to the page */
{
	return(buf_block_get_page_zip(buf_block_align(ptr)));
}
#endif /* UNIV_DEBUG || UNIV_ZIP_DEBUG */
#endif /* !UNIV_HOTBACKUP */

/**********************************************************************//**
Gets the space id, page offset, and byte offset within page of a
pointer pointing to a buffer frame containing a file page. */
UNIV_INLINE
void
buf_ptr_get_fsp_addr(
/*=================*/
	const void*	ptr,	/*!< in: pointer to a buffer frame */
	ulint*		space,	/*!< out: space id */
	fil_addr_t*	addr)	/*!< out: page offset and byte offset */
{
	const page_t*	page = (const page_t*) ut_align_down(ptr,
							     UNIV_PAGE_SIZE);

	*space = mach_read_from_4(page + FIL_PAGE_ARCH_LOG_NO_OR_SPACE_ID);
	addr->page = mach_read_from_4(page + FIL_PAGE_OFFSET);
	addr->boffset = ut_align_offset(ptr, UNIV_PAGE_SIZE);
}

#ifndef UNIV_HOTBACKUP
/**********************************************************************//**
Gets the hash value of the page the pointer is pointing to. This can be used
in searches in the lock hash table.
@return lock hash value */
UNIV_INLINE
ulint
buf_block_get_lock_hash_val(
/*========================*/
	const buf_block_t*	block)	/*!< in: block */
{
	ut_ad(block);
	ut_ad(buf_page_in_file(&block->page));
#ifdef UNIV_SYNC_DEBUG
	ut_ad(rw_lock_own(&(((buf_block_t*) block)->lock), RW_LOCK_EXCLUSIVE)
	      || rw_lock_own(&(((buf_block_t*) block)->lock), RW_LOCK_SHARED));
#endif /* UNIV_SYNC_DEBUG */
	return(block->lock_hash_val);
}

/********************************************************************//**
Allocates a buf_page_t descriptor. This function must succeed. In case
of failure we assert in this function.
@return: the allocated descriptor. */
UNIV_INLINE
buf_page_t*
buf_page_alloc_descriptor(void)
/*===========================*/
{
	buf_page_t*	bpage;

	bpage = (buf_page_t*) ut_malloc(sizeof *bpage);
	ut_d(memset(bpage, 0, sizeof *bpage));
	UNIV_MEM_ALLOC(bpage, sizeof *bpage);

	return(bpage);
}

/********************************************************************//**
Free a buf_page_t descriptor. */
UNIV_INLINE
void
buf_page_free_descriptor(
/*=====================*/
	buf_page_t*	bpage)	/*!< in: bpage descriptor to free. */
{
	ut_free(bpage);
}

/********************************************************************//**
Allocates a buffer block.
@return own: the allocated block, in state BUF_BLOCK_MEMORY */
UNIV_INLINE
buf_block_t*
buf_block_alloc(void)
/*=================*/
{
	buf_block_t*	block;

	block = buf_LRU_get_free_block();

	buf_block_set_state(block, BUF_BLOCK_MEMORY);

	return(block);
}

/********************************************************************//**
Frees a buffer block which does not contain a file page. */
UNIV_INLINE
void
buf_block_free(
/*===========*/
	buf_block_t*	block)	/*!< in, own: block to be freed */
{
	buf_pool_mutex_enter();

	mutex_enter(&block->mutex);

	ut_a(buf_block_get_state(block) != BUF_BLOCK_FILE_PAGE);

	buf_LRU_block_free_non_file_page(block);

	mutex_exit(&block->mutex);

	buf_pool_mutex_exit();
}
#endif /* !UNIV_HOTBACKUP */

/*********************************************************************//**
Copies contents of a buffer frame to a given buffer.
@return buf */
UNIV_INLINE
byte*
buf_frame_copy(
/*===========*/
	byte*			buf,	/*!< in: buffer to copy to */
	const buf_frame_t*	frame)	/*!< in: buffer frame */
{
	ut_ad(buf && frame);

	ut_memcpy(buf, frame, UNIV_PAGE_SIZE);

	return(buf);
}

#ifndef UNIV_HOTBACKUP
/********************************************************************//**
Calculates a folded value of a file page address to use in the page hash
table.
@return the folded value */
UNIV_INLINE
ulint
buf_page_address_fold(
/*==================*/
	ulint	space,	/*!< in: space id */
	ulint	offset)	/*!< in: offset of the page within space */
{
	return((space << 20) + space + offset);
}

/********************************************************************//**
Gets the youngest modification log sequence number for a frame.
Returns zero if not file page or no modification occurred yet.
@return newest modification to page */
UNIV_INLINE
ib_uint64_t
buf_page_get_newest_modification(
/*=============================*/
	const buf_page_t*	bpage)	/*!< in: block containing the
					page frame */
{
	ib_uint64_t	lsn;
	mutex_t*	block_mutex = buf_page_get_mutex(bpage);

	mutex_enter(block_mutex);

	if (buf_page_in_file(bpage)) {
		lsn = bpage->newest_modification;
	} else {
		lsn = 0;
	}

	mutex_exit(block_mutex);

	return(lsn);
}

/********************************************************************//**
Increments the modify clock of a frame by 1. The caller must (1) own the
buf_pool mutex and block bufferfix count has to be zero, (2) or own an x-lock
on the block. */
UNIV_INLINE
void
buf_block_modify_clock_inc(
/*=======================*/
	buf_block_t*	block)	/*!< in: block */
{
#ifdef UNIV_SYNC_DEBUG
	ut_ad((buf_pool_mutex_own()
	       && (block->page.buf_fix_count == 0))
	      || rw_lock_own(&(block->lock), RW_LOCK_EXCLUSIVE));
#endif /* UNIV_SYNC_DEBUG */

	block->modify_clock++;
}

/********************************************************************//**
Returns the value of the modify clock. The caller must have an s-lock
or x-lock on the block.
@return value */
UNIV_INLINE
ib_uint64_t
buf_block_get_modify_clock(
/*=======================*/
	buf_block_t*	block)	/*!< in: block */
{
#ifdef UNIV_SYNC_DEBUG
	ut_ad(rw_lock_own(&(block->lock), RW_LOCK_SHARED)
	      || rw_lock_own(&(block->lock), RW_LOCK_EXCLUSIVE));
#endif /* UNIV_SYNC_DEBUG */

	return(block->modify_clock);
}

/*******************************************************************//**
Increments the bufferfix count. */
UNIV_INLINE
void
buf_block_buf_fix_inc_func(
/*=======================*/
#ifdef UNIV_SYNC_DEBUG
	const char*	file,	/*!< in: file name */
	ulint		line,	/*!< in: line */
#endif /* UNIV_SYNC_DEBUG */
|
|
buf_block_t* block) /*!< in/out: block to bufferfix */
|
|
{
|
|
#ifdef UNIV_SYNC_DEBUG
|
|
ibool ret;
|
|
|
|
ret = rw_lock_s_lock_nowait(&(block->debug_latch), file, line);
|
|
ut_a(ret);
|
|
#endif /* UNIV_SYNC_DEBUG */
|
|
ut_ad(mutex_own(&block->mutex));
|
|
|
|
block->page.buf_fix_count++;
|
|
}
|
|
|
|
/*******************************************************************//**
|
|
Decrements the bufferfix count. */
|
|
UNIV_INLINE
|
|
void
|
|
buf_block_buf_fix_dec(
|
|
/*==================*/
|
|
buf_block_t* block) /*!< in/out: block to bufferunfix */
|
|
{
|
|
ut_ad(mutex_own(&block->mutex));
|
|
|
|
block->page.buf_fix_count--;
|
|
#ifdef UNIV_SYNC_DEBUG
|
|
rw_lock_s_unlock(&block->debug_latch);
|
|
#endif
|
|
}
|
|
|
|
/******************************************************************//**
|
|
Returns the control block of a file page, NULL if not found.
|
|
@return block, NULL if not found */
|
|
UNIV_INLINE
|
|
buf_page_t*
|
|
buf_page_hash_get(
|
|
/*==============*/
|
|
ulint space, /*!< in: space id */
|
|
ulint offset) /*!< in: offset of the page within space */
|
|
{
|
|
buf_page_t* bpage;
|
|
ulint fold;
|
|
|
|
ut_ad(buf_pool);
|
|
ut_ad(buf_pool_mutex_own());
|
|
|
|
/* Look for the page in the hash table */
|
|
|
|
fold = buf_page_address_fold(space, offset);
|
|
|
|
HASH_SEARCH(hash, buf_pool->page_hash, fold, buf_page_t*, bpage,
|
|
ut_ad(bpage->in_page_hash && !bpage->in_zip_hash
|
|
&& buf_page_in_file(bpage)),
|
|
bpage->space == space && bpage->offset == offset);
|
|
if (bpage) {
|
|
ut_a(buf_page_in_file(bpage));
|
|
ut_ad(bpage->in_page_hash);
|
|
ut_ad(!bpage->in_zip_hash);
|
|
#if UNIV_WORD_SIZE == 4
|
|
/* On 32-bit systems, there is no padding in
|
|
buf_page_t. On other systems, Valgrind could complain
|
|
about uninitialized pad bytes. */
|
|
UNIV_MEM_ASSERT_RW(bpage, sizeof *bpage);
|
|
#endif
|
|
}
|
|
|
|
return(bpage);
|
|
}
|
|
|
|
/******************************************************************//**
|
|
Returns the control block of a file page, NULL if not found
|
|
or an uncompressed page frame does not exist.
|
|
@return block, NULL if not found */
|
|
UNIV_INLINE
|
|
buf_block_t*
|
|
buf_block_hash_get(
|
|
/*===============*/
|
|
ulint space, /*!< in: space id */
|
|
ulint offset) /*!< in: offset of the page within space */
|
|
{
|
|
return(buf_page_get_block(buf_page_hash_get(space, offset)));
|
|
}
|
|
|
|
/********************************************************************//**
|
|
Returns TRUE if the page can be found in the buffer pool hash table.
|
|
|
|
NOTE that it is possible that the page is not yet read from disk,
|
|
though.
|
|
|
|
@return TRUE if found in the page hash table */
|
|
UNIV_INLINE
|
|
ibool
|
|
buf_page_peek(
|
|
/*==========*/
|
|
ulint space, /*!< in: space id */
|
|
ulint offset) /*!< in: page number */
|
|
{
|
|
const buf_page_t* bpage;
|
|
|
|
buf_pool_mutex_enter();
|
|
|
|
bpage = buf_page_hash_get(space, offset);
|
|
|
|
buf_pool_mutex_exit();
|
|
|
|
return(bpage != NULL);
|
|
}
|
|
|
|
/********************************************************************//**
|
|
Releases a compressed-only page acquired with buf_page_get_zip(). */
|
|
UNIV_INLINE
|
|
void
|
|
buf_page_release_zip(
|
|
/*=================*/
|
|
buf_page_t* bpage) /*!< in: buffer block */
|
|
{
|
|
buf_block_t* block;
|
|
|
|
ut_ad(bpage);
|
|
ut_a(bpage->buf_fix_count > 0);
|
|
|
|
switch (buf_page_get_state(bpage)) {
|
|
case BUF_BLOCK_ZIP_PAGE:
|
|
case BUF_BLOCK_ZIP_DIRTY:
|
|
mutex_enter(&buf_pool_zip_mutex);
|
|
bpage->buf_fix_count--;
|
|
mutex_exit(&buf_pool_zip_mutex);
|
|
return;
|
|
case BUF_BLOCK_FILE_PAGE:
|
|
block = (buf_block_t*) bpage;
|
|
mutex_enter(&block->mutex);
|
|
#ifdef UNIV_SYNC_DEBUG
|
|
rw_lock_s_unlock(&block->debug_latch);
|
|
#endif
|
|
bpage->buf_fix_count--;
|
|
mutex_exit(&block->mutex);
|
|
return;
|
|
case BUF_BLOCK_ZIP_FREE:
|
|
case BUF_BLOCK_NOT_USED:
|
|
case BUF_BLOCK_READY_FOR_USE:
|
|
case BUF_BLOCK_MEMORY:
|
|
case BUF_BLOCK_REMOVE_HASH:
|
|
break;
|
|
}
|
|
|
|
ut_error;
|
|
}
|
|
|
|
/********************************************************************//**
|
|
Decrements the bufferfix count of a buffer control block and releases
|
|
a latch, if specified. */
|
|
UNIV_INLINE
|
|
void
|
|
buf_page_release(
|
|
/*=============*/
|
|
buf_block_t* block, /*!< in: buffer block */
|
|
ulint rw_latch, /*!< in: RW_S_LATCH, RW_X_LATCH,
|
|
RW_NO_LATCH */
|
|
mtr_t* mtr) /*!< in: mtr */
|
|
{
|
|
ut_ad(block);
|
|
|
|
ut_a(buf_block_get_state(block) == BUF_BLOCK_FILE_PAGE);
|
|
ut_a(block->page.buf_fix_count > 0);
|
|
|
|
if (rw_latch == RW_X_LATCH && mtr->modifications) {
|
|
buf_pool_mutex_enter();
|
|
buf_flush_note_modification(block, mtr);
|
|
buf_pool_mutex_exit();
|
|
}
|
|
|
|
mutex_enter(&block->mutex);
|
|
|
|
#ifdef UNIV_SYNC_DEBUG
|
|
rw_lock_s_unlock(&(block->debug_latch));
|
|
#endif
|
|
block->page.buf_fix_count--;
|
|
|
|
mutex_exit(&block->mutex);
|
|
|
|
if (rw_latch == RW_S_LATCH) {
|
|
rw_lock_s_unlock(&(block->lock));
|
|
} else if (rw_latch == RW_X_LATCH) {
|
|
rw_lock_x_unlock(&(block->lock));
|
|
}
|
|
}
|
|
|
|
#ifdef UNIV_SYNC_DEBUG
|
|
/*********************************************************************//**
|
|
Adds latch level info for the rw-lock protecting the buffer frame. This
|
|
should be called in the debug version after a successful latching of a
|
|
page if we know the latching order level of the acquired latch. */
|
|
UNIV_INLINE
|
|
void
|
|
buf_block_dbg_add_level(
|
|
/*====================*/
|
|
buf_block_t* block, /*!< in: buffer page
|
|
where we have acquired latch */
|
|
ulint level) /*!< in: latching order level */
|
|
{
|
|
sync_thread_add_level(&block->lock, level, FALSE);
|
|
}
|
|
#endif /* UNIV_SYNC_DEBUG */
|
|
#endif /* !UNIV_HOTBACKUP */
|