Django/Reading the migrate process (table creation)

Introduction
django/db/backends/base/base.py
django/db/backends/base/schema.py
django/db/migrations/migration.py
django/db/migrations/operations
BaseDatabaseSchemaEditor.create_model
BaseDatabaseSchemaEditor.create_model, continued
BaseDatabaseWrapper.cursor
Conclusion

Introduction

Having seen how makemigrations creates migrations, it is finally time to look at how they are applied. First a recap: the flow of the migrate command.

1. Load the files in the app's migrations directory (django.db.migrations.loader.MigrationLoader)
2. Decide which migrations to run (django.db.migrations.executor.MigrationExecutor)
3. Run each migration (django.db.migrations.executor.MigrationExecutor)

As we saw before, the core of "running each migration" is the following part of the apply_migration method of the MigrationExecutor class:

with self.connection.schema_editor(atomic=migration.atomic) as schema_editor:
    state = migration.apply(state, schema_editor)

connection is the connection to the database; more concretely, it is a DatabaseWrapper object from django.db.backends.sqlite3.base. (Of course, if you use a different DB, the sqlite3 part changes.)

django/db/backends/base/base.py

The schema_editor method is defined on the base class, BaseDatabaseWrapper.

    def schema_editor(self, *args, **kwargs):
        """
        Returns a new instance of this backend's SchemaEditor.
        """
        if self.SchemaEditorClass is None:
            raise NotImplementedError(
                'The SchemaEditorClass attribute of this database wrapper is still None')
        return self.SchemaEditorClass(self, *args, **kwargs)

It works, so of course SchemaEditorClass is set appropriately. Here is the sqlite3 side (excerpt):

from .schema import DatabaseSchemaEditor                    # isort:skip
 
class DatabaseWrapper(BaseDatabaseWrapper):
    SchemaEditorClass = DatabaseSchemaEditor
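
The base/subclass arrangement above can be sketched without Django. The class names below (BaseWrapper, SQLiteWrapper and the two editor stand-ins) are hypothetical and heavily simplified; only the SchemaEditorClass attribute and the schema_editor method mirror the excerpts.

```python
class BaseSchemaEditor:
    """Stand-in for BaseDatabaseSchemaEditor (hypothetical, simplified)."""

    def __init__(self, connection, atomic=True):
        self.connection = connection
        self.atomic_migration = atomic


class SQLiteSchemaEditor(BaseSchemaEditor):
    """Stand-in for the sqlite3 DatabaseSchemaEditor."""


class BaseWrapper:
    SchemaEditorClass = None  # each backend subclass is expected to set this

    def schema_editor(self, *args, **kwargs):
        if self.SchemaEditorClass is None:
            raise NotImplementedError("SchemaEditorClass is still None")
        # The wrapper passes itself in as the editor's connection.
        return self.SchemaEditorClass(self, *args, **kwargs)


class SQLiteWrapper(BaseWrapper):
    SchemaEditorClass = SQLiteSchemaEditor


editor = SQLiteWrapper().schema_editor(atomic=False)
print(type(editor).__name__, editor.atomic_migration)
```

The base class never needs to know which editor it is creating; the subclass only supplies a class attribute.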

django/db/backends/base/schema.py

The SchemaEditor is organized the same way: a base implementation plus one for each DBMS.

The base class BaseDatabaseSchemaEditor contains the following:

    def __enter__(self):
        self.deferred_sql = []
        if self.atomic_migration:
            self.atomic = atomic(self.connection.alias)
            self.atomic.__enter__()
        return self
 
    def __exit__(self, exc_type, exc_value, traceback):
        if exc_type is None:
            for sql in self.deferred_sql:
                self.execute(sql)
        if self.atomic_migration:
            self.atomic.__exit__(exc_type, exc_value, traceback)

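__enter__ runs when the with statement is entered and __exit__ runs when it is left. atomic does appear to make the migration run atomically, but the details are a hassle, so I will skip them. The other thing worth noting is deferred_sql. Judging from the name and from __exit__, SQL that has to wait is collected here while the migration's operations run and is executed at the end, provided no exception was raised.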
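
To make the deferred_sql behaviour concrete, here is a minimal sketch of the same context-manager shape. DeferredSQLEditor is a toy class of my own: its execute only records statements instead of talking to a database, and the atomic handling is left out.

```python
class DeferredSQLEditor:
    """Toy schema editor: runs deferred statements only on a clean exit."""

    def __init__(self):
        self.executed = []

    def execute(self, sql):
        self.executed.append(sql)

    def __enter__(self):
        self.deferred_sql = []
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        if exc_type is None:
            for sql in self.deferred_sql:
                self.execute(sql)
        # Returning None lets any exception propagate to the caller.


with DeferredSQLEditor() as ok:
    ok.execute("CREATE TABLE t (id integer)")
    ok.deferred_sql.append("CREATE INDEX t_id ON t (id)")

failed = DeferredSQLEditor()
try:
    with failed:
        failed.deferred_sql.append("CREATE INDEX never_run ON t (id)")
        raise RuntimeError("operation failed")
except RuntimeError:
    pass

print(ok.executed)
print(failed.executed)
```

In the first block the deferred statement runs after the immediate one; in the second it is dropped because the block raised.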

The subclass sqlite3.schema.DatabaseSchemaEditor seems to talk to the database in __enter__, but that is backend-specific, so rather than going there yet, let's check the migration side first.

django/db/migrations/migration.py

The apply method of the Migration class:

    def apply(self, project_state, schema_editor, collect_sql=False):
        """
        Takes a project_state representing all migrations prior to this one
        and a schema_editor for a live database and applies the migration
        in a forwards order.
 
        Returns the resulting project state for efficient re-use by following
        Migrations.
        """
        for operation in self.operations:
            # If this operation cannot be represented as SQL, place a comment
            # there instead
            if collect_sql:
                pass  # body omitted in this excerpt
            # Save the state before the operation has run
            old_state = project_state.clone()
            operation.state_forwards(self.app_label, project_state)
            # Run the operation
            atomic_operation = operation.atomic or (self.atomic and operation.atomic is not False)
            if not schema_editor.atomic_migration and atomic_operation:
                # Force a transaction on a non-transactional-DDL backend or an
                # atomic operation inside a non-atomic migration.
                with atomic(schema_editor.connection.alias):
                    operation.database_forwards(self.app_label, schema_editor, old_state, project_state)
            else:
                # Normal behaviour
                operation.database_forwards(self.app_label, schema_editor, old_state, project_state)
        return project_state

Atomic or not, control passes on to each operation's database_forwards.
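
For anyone who does want to untangle the atomic condition, the expression can be lifted into a standalone function and tried with a few values. needs_own_transaction is a hypothetical helper, not part of Django; it only mirrors the two lines of apply shown above.

```python
def needs_own_transaction(migration_atomic, operation_atomic, editor_atomic):
    """Return True when apply() would wrap the operation in its own atomic block."""
    atomic_operation = operation_atomic or (
        migration_atomic and operation_atomic is not False
    )
    return bool(not editor_atomic and atomic_operation)


# The schema editor is already atomic: no extra transaction is needed.
print(needs_own_transaction(True, None, True))
# Non-atomic editor, atomic migration, operation has no opinion (None).
print(needs_own_transaction(True, None, False))
# The operation explicitly opts out with atomic=False.
print(needs_own_transaction(True, False, False))
# Non-atomic migration, but the operation asks for atomicity itself.
print(needs_own_transaction(False, True, False))
```

Which values operation.atomic actually takes for each operation class is not covered by the excerpts, so the inputs here are just sample combinations.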

django/db/migrations/operations

database_forwards on the base class Operation just raises an exception, so we need to look at an individual subclass. Here is database_forwards of the models.CreateModel class:

    def database_forwards(self, app_label, schema_editor, from_state, to_state):
        model = to_state.apps.get_model(app_label, self.name)
        if self.allow_migrate_model(schema_editor.connection.alias, model):
            schema_editor.create_model(model)

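allow_migrate... well, it had better allow it (lol). From here the call goes through Operation (the base class) to the router (a ConnectionRouter object from django.db.utils). DATABASE_ROUTERS appears to be the mechanism for distributing models across databases, but in ordinary use it is an empty list, so True is simply returned. And so the create_model method of schema_editor is called.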
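
Why an empty DATABASE_ROUTERS ends up as True can be sketched like this. TinyRouterChain is my own simplified stand-in for ConnectionRouter, and PollsOnlyOnDefault is a hypothetical router; the assumption is that the first router with an opinion (a non-None answer) wins and that the fallback is True.

```python
class TinyRouterChain:
    """Simplified stand-in for ConnectionRouter (assumed behaviour)."""

    def __init__(self, routers=()):
        self.routers = list(routers)

    def allow_migrate(self, db, app_label, **hints):
        for router in self.routers:
            method = getattr(router, "allow_migrate", None)
            if method is None:
                continue  # this router does not implement the hook
            allow = method(db, app_label, **hints)
            if allow is not None:
                return allow
        return True  # no router objected, so migrating is allowed


class PollsOnlyOnDefault:
    """Hypothetical router: keep the polls app on the 'default' database."""

    def allow_migrate(self, db, app_label, **hints):
        if app_label == "polls":
            return db == "default"
        return None  # no opinion about other apps


print(TinyRouterChain().allow_migrate("default", "polls"))
print(TinyRouterChain([PollsOnlyOnDefault()]).allow_migrate("other", "polls"))
```

With no routers the loop body never runs, which is the "ordinary use" case described above.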

BaseDatabaseSchemaEditor.create_model

Now back to the SchemaEditor and its create_model method. First, the first half:

    def create_model(self, model):
        """
        Takes a model and creates a table for it in the database.
        Will also create any accompanying indexes or unique constraints.
        """
        # Create column SQL, add FK deferreds if needed
        column_sqls = []
        params = []
        for field in model._meta.local_fields:
            # SQL
            definition, extra_params = self.column_sql(model, field)
            if definition is None:
                continue
            # Check constraints can go on the column SQL here
            db_params = field.db_parameters(connection=self.connection)
            if db_params['check']:
                definition += " CHECK (%s)" % db_params['check']
            # Autoincrement SQL (for backends with inline variant)
            col_type_suffix = field.db_type_suffix(connection=self.connection)
            if col_type_suffix:
                definition += " %s" % col_type_suffix
            params.extend(extra_params)
            # FK
            if field.remote_field and field.db_constraint:
                to_table = field.remote_field.model._meta.db_table
                to_column = field.remote_field.model._meta.get_field(field.remote_field.field_name).column
                if self.connection.features.supports_foreign_keys:
                    self.deferred_sql.append(self._create_fk_sql(model, field, "_fk_%(to_table)s_%(to_column)s"))
                elif self.sql_create_inline_fk:
                    definition += " " + self.sql_create_inline_fk % {
                        "to_table": self.quote_name(to_table),
                        "to_column": self.quote_name(to_column),
                    }
            # Add the SQL to our big list
            column_sqls.append("%s %s" % (
                self.quote_name(field.column),
                definition,
            ))
            # Autoincrement SQL (for backends with post table definition variant)
            if field.get_internal_type() in ("AutoField", "BigAutoField"):
                autoinc_sql = self.connection.ops.autoinc_sql(model._meta.db_table, field.column)
                if autoinc_sql:
                    self.deferred_sql.extend(autoinc_sql)

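It looks like an SQL column is being built for each field of the model. In cases like this the best approach is to compare input and output to see what is going on. So, the input: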

        migrations.CreateModel(
            name='Choice',
            fields=[
                ('id', models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')),
                ('choice_text', models.CharField(max_length=200)),
                ('votes', models.IntegerField(default=0)),
            ],
        ),

And the corresponding output (for SQLite; you can check it with "python manage.py sqlmigrate polls 0001"):

CREATE TABLE "polls_choice" ("id" integer NOT NULL PRIMARY KEY AUTOINCREMENT,
"choice_text" varchar(200) NOT NULL, "votes" integer NOT NULL);

Now let's trace through it.

column_sql

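The column_sql method. It is called with include_default left at False, so default values are not handled here; I wonder whether that is all right. (The sqlmigrate output above has no DEFAULT clause for votes, which is at least consistent with this.)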

    def column_sql(self, model, field, include_default=False):
        """
        Takes a field and returns its column definition.
        The field must already have had set_attributes_from_name called.
        """
        # Get the column's type and use that as the basis of the SQL
        db_params = field.db_parameters(connection=self.connection)
        sql = db_params['type']
        params = []
        # Check for fields that aren't actually columns (e.g. M2M)
        if sql is None:
            return None, None
        # Work out nullability
        null = field.null
        # If we were told to include a default value, do so
        include_default = include_default and not self.skip_default(field)
        if include_default:
            pass  # body omitted in this excerpt
        # Oracle treats the empty string ('') as null, so coerce the null
        # option whenever '' is a possible value.
        if (field.empty_strings_allowed and not field.primary_key and
                self.connection.features.interprets_empty_strings_as_nulls):
            null = True
        if null and not self.connection.features.implied_column_null:
            sql += " NULL"
        elif not null:
            sql += " NOT NULL"
        # Primary key/unique outputs
        if field.primary_key:
            sql += " PRIMARY KEY"
        elif field.unique:
            sql += " UNIQUE"
        # Optionally add the tablespace if it's an implicitly indexed column
        tablespace = field.db_tablespace or model._meta.db_tablespace
        if tablespace and self.connection.features.supports_tablespaces and field.unique:
            sql += " %s" % self.connection.ops.tablespace_sql(tablespace, inline=True)
        # Return the sql
        return sql, params

Next, db_parameters and its related methods on the field, that is, django.db.models.fields.Field. Subclasses sometimes override them, but the basics are the same.

    def db_parameters(self, connection):
        """
        Extension of db_type(), providing a range of different return
        values (type, checks).
        This will look at db_type(), allowing custom model fields to override it.
        """
        type_string = self.db_type(connection)
        check_string = self.db_check(connection)
        return {
            "type": type_string,
            "check": check_string,
        }
 
    def db_type(self, connection):
        """
        Return the database column data type for this field, for the provided
        connection.
        """
        data = DictWrapper(self.__dict__, connection.ops.quote_name, "qn_")
        try:
            return connection.data_types[self.get_internal_type()] % data
        except KeyError:
            return None
 
    def get_internal_type(self):
        return self.__class__.__name__

Looking at connection, that is, the DatabaseWrapper class, we find the database type that corresponds to each field type:

    data_types = {
        'AutoField': 'integer',
        'CharField': 'varchar(%(max_length)s)',
        'IntegerField': 'integer',
        # other mappings...
    }

For CharField, passing self's __dict__ (wrapped) replaces the "%(max_length)s" part, giving something like "varchar(200)". In any case, running column_sql yields an SQL string expressing type and constraints, such as "integer NOT NULL PRIMARY KEY".
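
The substitution is plain Python %-formatting with a mapping, so it can be tried on its own. In this sketch the three data_types entries are copied from the excerpt above, while the standalone db_type function and the plain attrs dict (standing in for the wrapped __dict__) are simplifications of mine.

```python
data_types = {
    "AutoField": "integer",
    "CharField": "varchar(%(max_length)s)",
    "IntegerField": "integer",
}


def db_type(internal_type, attrs):
    """Look up the column type and fill in placeholders from the field attributes."""
    try:
        return data_types[internal_type] % attrs
    except KeyError:
        # Unknown field type (or a missing attribute): not a real column.
        return None


print(db_type("CharField", {"max_length": 200}))
print(db_type("IntegerField", {"max_length": None}))
print(db_type("ManyToManyField", {}))
```

Templates without placeholders, such as "integer", pass through the % operator unchanged when the right-hand side is a mapping.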

Auto-increment handling (inside the table definition)

Control returns to create_model, which next calls the db_type_suffix method of the Field class.

            # Autoincrement SQL (for backends with inline variant)
            col_type_suffix = field.db_type_suffix(connection=self.connection)
            if col_type_suffix:
                definition += " %s" % col_type_suffix

    def db_type_suffix(self, connection):
        return connection.data_types_suffix.get(self.get_internal_type())

The data_types_suffix definition on connection (DatabaseWrapper), SQLite version:

    data_types_suffix = {
        'AutoField': 'AUTOINCREMENT',
        'BigAutoField': 'AUTOINCREMENT',
    }

So AUTOINCREMENT is appended. As an aside, the auto-increment SQL at the end of the loop seems to be actually added only for Oracle.
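
Putting the pieces so far together, the column definition for the id field can be rebuilt by hand. column_definition is a hypothetical helper that condenses column_sql plus the suffix step; it ignores defaults, CHECK constraints, tablespaces and the Oracle empty-string rule, and it assumes a backend where " NULL" is written out explicitly.

```python
data_types_suffix = {
    "AutoField": "AUTOINCREMENT",
    "BigAutoField": "AUTOINCREMENT",
}


def column_definition(db_type, internal_type, null=False, primary_key=False, unique=False):
    """Condensed sketch of column_sql followed by the inline autoincrement suffix."""
    sql = db_type
    sql += " NULL" if null else " NOT NULL"
    if primary_key:
        sql += " PRIMARY KEY"
    elif unique:
        sql += " UNIQUE"
    suffix = data_types_suffix.get(internal_type)
    if suffix:
        sql += " %s" % suffix
    return sql


print(column_definition("integer", "AutoField", primary_key=True))
print(column_definition("varchar(200)", "CharField"))
```

The two results correspond to the "id" and "choice_text" columns in the sqlmigrate output shown earlier.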

Foreign key handling

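The example above had no foreign key. In fact, Django adds the foreign key separately with AddField, and on SQLite adding a single field means creating a new table and copying the data over, which is odd (apparently because of ALTER TABLE limitations). I won't work hard at reading the parts that work that hard, but let's at least see how foreign keys are handled.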

In the sqlmigrate output, the foreign key part looks like this:

"question_id" integer NOT NULL REFERENCES "polls_question" ("id")

The part from REFERENCES onward is built here:

                if self.connection.features.supports_foreign_keys:
                    self.deferred_sql.append(self._create_fk_sql(model, field, "_fk_%(to_table)s_%(to_column)s"))
                elif self.sql_create_inline_fk:
                    definition += " " + self.sql_create_inline_fk % {
                        "to_table": self.quote_name(to_table),
                        "to_column": self.quote_name(to_column),
                    }

features is the DatabaseFeatures class in the features module. For SQLite, supports_foreign_keys is False (apparently because of the ALTER TABLE limitation mentioned earlier). Instead, sql_create_inline_fk is set (this one lives on the SchemaEditor), so the REFERENCES clause gets generated.
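
The inline variant is easy to reproduce with the standard sqlite3 module. The template string below is an assumption: it is simply what would produce the REFERENCES clause in the sqlmigrate output, not something copied from Django, and quote_name is a simplified stand-in.

```python
import sqlite3

# Assumed template; chosen to reproduce the sqlmigrate output above.
sql_create_inline_fk = "REFERENCES %(to_table)s (%(to_column)s)"


def quote_name(name):
    """Simplified quoting: wrap the identifier in double quotes."""
    return '"%s"' % name


definition = "integer NOT NULL"
definition += " " + sql_create_inline_fk % {
    "to_table": quote_name("polls_question"),
    "to_column": quote_name("id"),
}
column = "%s %s" % (quote_name("question_id"), definition)
print(column)

# Check that SQLite really records the inline clause as a foreign key.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "polls_question" ("id" integer NOT NULL PRIMARY KEY AUTOINCREMENT)')
conn.execute(
    'CREATE TABLE "polls_choice" ("id" integer NOT NULL PRIMARY KEY AUTOINCREMENT, %s)' % column
)
fks = conn.execute("PRAGMA foreign_key_list(polls_choice)").fetchall()
# Columns 2..4 of each row are: referenced table, local column, referenced column.
print(fks[0][2:5])
conn.close()
```

Because the constraint lives inside the column definition, no later ALTER TABLE is needed, which fits the explanation above.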

BaseDatabaseSchemaEditor.create_model, continued

Once the SQL representation of each field is ready like this, it is finally time to create the table.

        # Add any unique_togethers (always deferred, as some fields might be
        # created afterwards, like geometry fields with some backends)
        for fields in model._meta.unique_together:
            columns = [model._meta.get_field(field).column for field in fields]
            self.deferred_sql.append(self._create_unique_sql(model, columns))
        # Make the table
        sql = self.sql_create_table % {
            "table": self.quote_name(model._meta.db_table),
            "definition": ", ".join(column_sqls)
        }
        if model._meta.db_tablespace:
            tablespace_sql = self.connection.ops.tablespace_sql(model._meta.db_tablespace)
            if tablespace_sql:
                sql += ' ' + tablespace_sql
        # Prevent using [] as params, in the case a literal '%' is used in the definition
        self.execute(sql, params or None)
 
        # Add any field index and index_together's (deferred as SQLite3 _remake_table needs it)
        self.deferred_sql.extend(self._model_indexes_sql(model))
 
        # Make M2M tables
        for field in model._meta.local_many_to_many:
            if field.remote_field.through._meta.auto_created:
                self.create_model(field.remote_field.through)

A lot is going on, but looking at execute is enough.

    def execute(self, sql, params=[]):
        """
        Executes the given SQL statement, with optional parameters.
        """
        # Log the command we're running, then run it
        logger.debug("%s; (params %r)", sql, params, extra={'params': params, 'sql': sql})
        if self.collect_sql:
            pass  # body omitted in this excerpt
        else:
            with self.connection.cursor() as cursor:
                cursor.execute(sql, params)

It obtains a cursor from connection and delegates the work to it.
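
The whole "join the columns, fill the template, hand it to a cursor" step can be replayed against an in-memory SQLite database. The sql_create_table template here is an assumption (it is what yields the CREATE TABLE statement shown earlier), and the column strings are copied from that sqlmigrate output.

```python
import sqlite3
from contextlib import closing

# Assumed template; it reproduces the CREATE TABLE statement seen earlier.
sql_create_table = "CREATE TABLE %(table)s (%(definition)s)"

column_sqls = [
    '"id" integer NOT NULL PRIMARY KEY AUTOINCREMENT',
    '"choice_text" varchar(200) NOT NULL',
    '"votes" integer NOT NULL',
]
sql = sql_create_table % {
    "table": '"polls_choice"',
    "definition": ", ".join(column_sqls),
}
print(sql)

conn = sqlite3.connect(":memory:")
# sqlite3 cursors are not context managers themselves, hence closing().
with closing(conn.cursor()) as cursor:
    cursor.execute(sql)
    cursor.execute("PRAGMA table_info(polls_choice)")
    columns = [row[1] for row in cursor.fetchall()]  # row[1] is the column name
print(columns)
conn.close()
```

Apart from the trailing semicolon, the generated statement matches what sqlmigrate printed.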

BaseDatabaseWrapper.cursor

We are now at DatabaseWrapper. The cursor method is defined on the base class, BaseDatabaseWrapper.

    def cursor(self):
        """
        Creates a cursor, opening a connection if necessary.
        """
        self.validate_thread_sharing()
        if self.queries_logged:
            cursor = self.make_debug_cursor(self._cursor())
        else:
            cursor = self.make_cursor(self._cursor())
        return cursor

There is a branch, but looking at _cursor and make_cursor is enough. First, _cursor:

    def _cursor(self):
        self.ensure_connection()
        with self.wrap_database_errors:
            return self.create_cursor()

connect

On to ensure_connection.

    def ensure_connection(self):
        """
        Guarantees that a connection to the database is established.
        """
        if self.connection is None:
            with self.wrap_database_errors:
                self.connect()

So persistent (lol). Here is the connect method.

    def connect(self):
        """Connects to the database. Assumes that the connection is closed."""
        # Check for invalid configurations.
        self.check_settings()
        # In case the previous connection was closed while in an atomic block
        self.in_atomic_block = False
        self.savepoint_ids = []
        self.needs_rollback = False
        # Reset parameters defining when to close the connection
        max_age = self.settings_dict['CONN_MAX_AGE']
        self.close_at = None if max_age is None else time.time() + max_age
        self.closed_in_transaction = False
        self.errors_occurred = False
        # Establish the connection
        conn_params = self.get_connection_params()
        self.connection = self.get_new_connection(conn_params)
        self.set_autocommit(self.settings_dict['AUTOCOMMIT'])
        self.init_connection_state()
        connection_created.send(sender=self.__class__, connection=self)
 
        self.run_on_commit = []

This establishes the connection. Of the methods called from connect, get_connection_params, get_new_connection, and init_connection_state just raise NotImplementedError in BaseDatabaseWrapper; each subclass implements them and performs the actual database connection.
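
The combination of "connect lazily" and "let the subclass do the real work" can be boiled down to a few lines. LazyWrapper and SQLiteLazyWrapper are hypothetical names; the sketch keeps only ensure_connection, connect and two of the hook methods, and uses the standard sqlite3 module for the concrete backend.

```python
import sqlite3


class LazyWrapper:
    """Simplified base wrapper: knows when to connect, not how."""

    def __init__(self, settings_dict):
        self.settings_dict = settings_dict
        self.connection = None

    def get_connection_params(self):
        raise NotImplementedError("subclasses must implement get_connection_params()")

    def get_new_connection(self, conn_params):
        raise NotImplementedError("subclasses must implement get_new_connection()")

    def connect(self):
        conn_params = self.get_connection_params()
        self.connection = self.get_new_connection(conn_params)

    def ensure_connection(self):
        if self.connection is None:  # connect only on first use
            self.connect()


class SQLiteLazyWrapper(LazyWrapper):
    def get_connection_params(self):
        return {"database": self.settings_dict["NAME"]}

    def get_new_connection(self, conn_params):
        return sqlite3.connect(**conn_params)


wrapper = SQLiteLazyWrapper({"NAME": ":memory:"})
print(wrapper.connection)  # nothing is opened until it is needed
wrapper.ensure_connection()
first = wrapper.connection
wrapper.ensure_connection()  # second call reuses the existing connection
print(wrapper.connection is first)
```

This is the textbook template-method shape: the base class fixes the sequence, the subclass fills in the steps.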

create_cursor

create_cursor is also a method that subclasses must implement. So here is create_cursor from the sqlite3 DatabaseWrapper:

    def create_cursor(self):
        return self.connection.cursor(factory=SQLiteCursorWrapper)

It is confusing, but this connection is the object returned by get_new_connection inside the connect method above. In this case it is the Connection object returned by the connect function of sqlite3.dbapi2 (imported under the name Database).

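The cursor method of that sqlite Connection object is called and a cursor is returned. Of course, this is a cursor at the sqlite3 level (created through the SQLiteCursorWrapper factory passed in).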
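
The factory argument is a feature of the standard sqlite3 module, so it can be tried in isolation. LoggingCursor below is a hypothetical subclass invented for this sketch; it has nothing to do with what SQLiteCursorWrapper actually does.

```python
import sqlite3


class LoggingCursor(sqlite3.Cursor):
    """Hypothetical cursor subclass that records each statement it runs."""

    log = []

    def execute(self, sql, params=()):
        LoggingCursor.log.append(sql)
        return super().execute(sql, params)


conn = sqlite3.connect(":memory:")
cursor = conn.cursor(factory=LoggingCursor)  # the connection instantiates our class
cursor.execute("SELECT 1 + 1")
row = cursor.fetchone()
print(type(cursor).__name__, row, LoggingCursor.log)
conn.close()
```

The returned object is still a genuine sqlite3 cursor; the factory only decides which subclass gets instantiated.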

django.db.backends.utils.CursorWrapper

So create_cursor has returned the DBMS-specific cursor. Next it is passed to the make_cursor method.

    def make_cursor(self, cursor):
        """
        Creates a cursor without debug logging.
        """
        return utils.CursorWrapper(cursor, self)

I'm tired of Wrappers by now (lol), but the cursor is wrapped and returned.

And so, after all that, the execute method of the returned Django-level cursor object is called.

    def execute(self, sql, params=None):
        self.db.validate_no_broken_transaction()
        with self.db.wrap_database_errors:
            if params is None:
                return self.cursor.execute(sql)
            else:
                return self.cursor.execute(sql, params)

In practice this goes on to call execute on SQLiteCursorWrapper, which in turn calls execute on the sqlite3 Cursor object, but that is enough. In any case, this is how the generated SQL gets executed.
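
The two remaining layers can be sketched as follows. TinyCursorWrapper mirrors the delegation in the execute method above; QmarkCursor is my guess at the kind of job the backend-level cursor has (turning "%s" placeholders into the "?" style sqlite3 expects) and should be read as an assumption, not as Django's code.

```python
import re
import sqlite3

FORMAT_QMARK = re.compile(r"(?<!%)%s")  # "%s" that is not part of an escaped "%%s"


class QmarkCursor(sqlite3.Cursor):
    """Assumed backend-level cursor: converts format-style placeholders to qmark."""

    def execute(self, query, params=None):
        if params is None:
            return super().execute(query)
        query = FORMAT_QMARK.sub("?", query).replace("%%", "%")
        return super().execute(query, params)


class TinyCursorWrapper:
    """Simplified Django-level wrapper: delegates everything to the inner cursor."""

    def __init__(self, cursor):
        self.cursor = cursor

    def execute(self, sql, params=None):
        if params is None:
            return self.cursor.execute(sql)
        return self.cursor.execute(sql, params)

    def __getattr__(self, attr):
        # Only called for attributes not found on the wrapper itself.
        return getattr(self.cursor, attr)


conn = sqlite3.connect(":memory:")
wrapped = TinyCursorWrapper(conn.cursor(factory=QmarkCursor))
wrapped.execute("CREATE TABLE t (name varchar(10))")
wrapped.execute("INSERT INTO t (name) VALUES (%s)", ["spam"])
wrapped.execute("SELECT name FROM t")
rows = wrapped.fetchall()
print(rows)
conn.close()
```

The CREATE TABLE call passes no params, so it takes the same "params is None" branch as the schema editor's call does when params is empty.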

Conclusion

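This time we looked at how a database migration runs, specifically how the SQL that creates a table for a model gets executed. The implementation is textbook: base classes that describe the overall, shared processing, and subclasses for each DBMS. It is a bit tedious that even while reading the base class you have to check what the subclass (the real thing) does, but there is not much difficult processing.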