MySQL Storage Engine

Storage Engines

MyISAM table type is mature, stable, and simple to manage

-MyISAM Static

-MyISAM Dynamic

-MyISAM Compressed

InnoDB -a modern alternative to MyISAM, which above all offers the following additional functions:

-Transactions

-Row Level Locking

-Foreign Key Constraints

-Crash Recovery

Limitations and Drawbacks:

-Tablespace Administration:the InnoDB table driver stores all data and indexes in a tablespace, comprising one or more files, that forms a sort of virtual file system

-Record Size:A data record can occupy at most 8000 bytes. This limit does not hold for TEXT and BLOB columns, of which only the first 512 bytes are stored in the database proper.

-Storage Requirement: The storage requirements for InnoDB tables are much greater than those for equivalent MyISAM tables (up to twice as big).

-Full-Text Index: For InnoDB tables one cannot use a full-text index.

-GIS Data: Geometric data cannot be stored in InnoDB tables.

-COUNT Problem: On account of open transactions, it is relatively difficult for the InnoDB table driver to determine the number of records in a table. Therefore, executing a SELECT COUNT(*) FROM TABLE is much slower than with MyISAM tables. This limitation should be eliminated soon.

-Table Locking: InnoDB uses its own locking algorithms in executing transactions.

MyISAM or InnoDB?

You can specify individually for each table in your database which table driver is to be used.

MyISAM tables are to be recommended whenever you want to manage tables in the most space- and time-efficient way possible. InnoDB tables, on the other hand, take precedence when your application makes use of transactions, requires greater security, or is to be accessed by many users simultaneously for making changes.

HEAP Tables

HEAP tables exist only in RAM (not on the hard drive). They use a hash index, which results in particularly fast access to individual data records. HEAP tables are often used as temporary tables.

Temporary Tables

With all of the table types listed above there exists the possibility of creating a table on a temporary basis. Such tables are automatically deleted as soon as the link with MySQL is terminated.

External Tables (type FEDERATED, since MySQL 5.0):

This table type enables access to tables in an external database. The database system can be located, for example, on another computer in the local network.There are some restrictions in accessing FEDERATED tables: Neither transactions nor query optimization with Query Cache are possible. The structure of external tables cannot be changed (though the

records can be). In other words, ALTER TABLE is not permitted, while INSERT,UPDATE, and DELETE are.

Lock Granularity

-One way to improve the concurrency of a shared resource is to be more selective about what is locked. Rather than locking the entire resource, lock only the part that contains the data you need to change. Better yet, lock only the exact piece of data you plan to change. By decreasing the amount of data that is locked at any one time, more changes can occur simultaneously—as long as they don’t conflict with each other.

Table locks

-The table as a whole is locked on an all-or-nothing basis. When a client wishes to write to a table (insert, delete, or update, etc.), it obtains a write lock that keeps all other read or write operations at bay for the duration of the operation. Once the write has completed, the table is unlocked to allow those waiting operations to continue.

Page locks

-A slightly more expensive form of locking that offers greater concurrency than table locking, a page lock is a lock applied to a portion of a table known as a page. All the records that reside on the same page in the table are affected by the lock.

Row locks

-In most applications, it’s relatively rare for several clients to need to update the exact same row at the same time. Row-level locking, as it’s commonly known, is available in MySQL’s InnoDB tables.

Multi-Version Concurrency Control(MVCC)

-Often referred to simply as versioning, MVCC is used by Oracle, by PostgreSQL, and by MySQL’s InnoDB storage engine. MVCC can be thought of as a new twist on row-level locking. It has the added benefit of allowing nonlocking reads while still locking the necessary records only during write operations.

Transactions

-A transaction is a group of SQL queries that are treated atomically, as a single unit of work. Either the entire group of queries is applied to a database, or none of them are.

ACID

-Atomicity, Consistency, Isolation, and Durability—four tightly related criteria that are required in a well-behaved transaction processing system.

Atomicity

-Transactions must function as a single indivisible unit of work. The entire transaction is either applied or rolled back. When transactions are atomic, there is no such thing as a partially completed transaction: it’s all or nothing.

Consistency

-The database should always move from one consistent state to the next. Consistency ensures that a crash between Steps 2 and 3 doesn’t result in $200 missing from the checking account. Because the transaction is never committed, none of the transaction’s changes are ever reflected in the database.

Isolation

-The results of a transaction are usually invisible to other transactions until the transaction is complete. This ensures that if a bank account summary runs after Step 2, but before Step 3, it still sees the $200 in the checking account. When we discuss isolation levels, you’ll understand why we said usually invisible.

Durability

-Once committed, the results of a transaction are permanent. This means that the changes must be recorded in such a way that system crashes won’t lose the data. Of course, if the database server’s disks fail, all bets are off. That’s a hardware problem.

Benefits and Drawbacks

-Because of the MySQL’s modularity you can decide on a per-table basis if you need ACID transactions or not, you don’t need to pay the performance penalty on a table that really won’t benefit from transactions.

Isolation Levels

-The SQL standard defines four isolation levels with specific rules for which changes are and aren’t visible inside and outside a transaction.

Read uncommitted

-In the read uncommitted isolation level, transactions can view the results of uncommitted transactions. At this level, many problems can occur unless you really, really know what you are doing and have a good reason for doing it. Read uncommitted is rarely used in practice. Reading uncommitted data is also known as a dirty read.

Read committed

-The default isolation level for most database systems is read committed. It satisfies the simple definition of isolation used earlier. A transaction will see the results only of transactions that were already committed when it began, and its changes won’t be visible to others until it’s committed.

Repeatable read

-At the repeatable read isolation level, any rows that are read during a transaction are locked so that they can’t be changed until the transaction finishes.

Serializable

-The highest level of isolation, serializable, solves the phantom read problem by ordering transactions so that they can’t conflict. At this level, a lot of timeouts and lock contention may occur, but the needs of your application may bring you to accept the decreased performance in favor of the data stability that results.

ANSI SQL isolation levels

Isolation level

Dirty reads possible

Non-repeatable reads possible

Phantom reads possible

Read uncommitted

Yes

Yes

Yes

Read committed

No

Yes

Yes

Repeatable read

No

No

Yes

Serializable

No

No

No

Transactions in MySQL

-MySQL provides two transaction-safe storage engines: Berkeley DB (BDB) and InnoDB

AUTOCOMMIT

-By default MySQL operates in AUTOCOMMIT mode. This means that unless you’ve explicitly begun a transaction, it automatically executes each query in a separate transaction. You can enable AUTOCOMMIT for the current connection by running:

SET AUTOCOMMIT = 1;

Disable it by executing:

SET AUTOCOMMIT = 0;

Mixing storage engines in transactions

-If you mix transaction-safe and non-transaction-safe tables (such as InnoDB and MyISAM) in a transaction, the transaction will work properly if all goes well. However, if a rollback is required, the changes to the non-transaction-safe table won’t be undone. This leaves the database in an inconsistent state that may be difficult to recover from (and renders the entire point of transactions moot).

Selecting the Right Engine

Transactions and concurrency

-When it comes to transactions and concurrency, consider the following guidelines:

-If your application requires transactions and high read/write concurrency, InnoDB is probably your best bet.

-If your application requires transactions but only moderate read/write concurrency, either BDB or InnoDB tables should work fine.

-If your application doesn’t require transactions and issues primarily SELECT or primarily INSERT/UPDATE queries, MyISAM is a good choice. Many web applications fall into this category.

Special features

-If your application requires referential integrity with foreign keys, you’re limited to just InnoDB tables. Do you need full-text search capabilities? Only MyISAM tables provide it.

-Non-MyISAM tables will generally use more CPU and disk space, but that may be a reasonable tradeoff in this case.

-Also, in the event of a crash, MyISAM tables may take quite a long time to check and repair while InnoDB tables should recover quickly.

Examples

Logging:

mod_log_sql for Apache -MyISAM tables works very well because they have very low overhead and can handle inserting thousands of records per second.

Read-only or read-mostly tables:

MyISAM tables-Tables that contain the data used to construct a catalog or listing of some sort (jobs, auctions, real estate, etc.) are usually read from far more often than they are written to.

Order processing:

When you deal with any sort of order processing, transactions are a requirement. Half-completed orders aren’t going to endear customers to your service. Using transaction-safe table types (InnoDB or BDB), these unfortunate “data surprises” can be avoided. Considering that BDB tables use—at best—locking at the page level, applications with high transaction volumes should consider InnoDB tables.

Storage engine features in MySQL

Attribute

MyISAM

Heap

BDB

InnoDB

Transactions

No

No

Yes

Yes

Lock granularity

Table

Table

Page (8 KB)

Row

Storage

Split files

In-memory

Single file per table

Tablespace(s)

Isolation levels

None

None

Read committed

All

Portable format

Yes

N/A

No

Yes

Referential integrity

No

No

No

Yes

Primary key with data

No

No

Yes

Yes

MySQL caches data records

No

Yes

Yes

Yes

Availability

All versions

All versions

MySQL-Max

All Versions

InnoDB Tables

-The InnoDB table handler breaks from MySQL tradition and stores all its data in a series of one or more data files that are collectively known as a tablespace. A tablespace is essentially a black box that is completely managed by InnoDB. If a tablespace if composed of several underlying files, you can’t choose or influence which of the underlying files will contain the data for any particular database or table.

One thought on “MySQL Storage Engine

  1. Pingback: NHibernate and MySQL’s MyISAM – Websofia

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>