Django/Reading what happens during migrate (the flow)

Introduction
django/core/management/commands/migrate.py
django/db
django/db/migrations/loader.py
django/db/migrations/executor.py
Conclusion

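Introduction

We have finally reached part 2 of the tutorial. Django 1.11 came out while I was away from this reading (oops), but I will keep reading the 1.10 source.

Part 2 of the tutorial is about models. The reading will cover how models are defined, how the database tables that correspond to them are created, and how models are manipulated.

The tutorial first tells us to type

$ python manage.py migrate

so let's look inside the migrate command and see how Django creates tables in the database. (At this point the app we are going to build has no models yet, so the command processes the models used by the Django-bundled apps that are enabled by default when a project is created.)

Incidentally, running it prints the following.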

Operations to perform:
  Apply all migrations: admin, auth, contenttypes, sessions
Running migrations:
  Applying contenttypes.0001_initial... OK
  Applying auth.0001_initial... OK
  Applying admin.0001_initial... OK
  Applying admin.0002_logentry_remove_auto_add... OK
  Applying contenttypes.0002_remove_content_type_name... OK
  Applying auth.0002_alter_permission_name_max_length... OK
  Applying auth.0003_alter_user_email_max_length... OK
  Applying auth.0004_alter_user_username_opts... OK
  Applying auth.0005_alter_user_last_login_null... OK
  Applying auth.0006_require_contenttypes_0002... OK
  Applying auth.0007_alter_validators_add_error_messages... OK
  Applying auth.0008_alter_user_username_max_length... OK
  Applying sessions.0001_initial... OK


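django/core/management/commands/migrate.py

As usual, the starting point is the file named after the command, which contains the Command class.

The handle method of the migrate command is fairly long. At a glance, it does the following:

Connect to the database
Decide which migrations to run and in what order
Run the migrations

Let's look at them one at a time.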


django/db

The part that appears to connect to the database is the following.


        # Get the database we're operating from
        db = options['database']
        connection = connections[db]
 
        # Hook for backends needing any database preparation
        connection.prepare_database()

connections is written casually, as if it were a local dictionary, but it actually comes from this import:

from django.db import DEFAULT_DB_ALIAS, connections, router, transaction

so it is an attribute of the django.db module. Turning to django/db/__init__.py, we find:

from django.db.utils import (
    DEFAULT_DB_ALIAS, DJANGO_VERSION_PICKLE_KEY, ConnectionHandler,
    ConnectionRouter, DatabaseError, DataError, Error, IntegrityError,
    InterfaceError, InternalError, NotSupportedError, OperationalError,
    ProgrammingError,
)
 
connections = ConnectionHandler()

Moving on to utils.py, here is the __getitem__ method of the ConnectionHandler class.

    def __getitem__(self, alias):
        if hasattr(self._connections, alias):
            return getattr(self._connections, alias)
 
        self.ensure_defaults(alias)
        self.prepare_test_settings(alias)
        db = self.databases[alias]
        backend = load_backend(db['ENGINE'])
        conn = backend.DatabaseWrapper(db, alias)
        setattr(self._connections, alias, conn)
        return conn

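Looking at load_backend, we can see that it imports and returns the base module of whatever is set as ENGINE; with the default settings.py that is django/db/backends/sqlite3/base.py. Note that self.databases is a property that holds DATABASES from settings.py. This should need no further explanation.

I expected prepare_database to make the connection, but it does not appear to. Let's go back to migrate.py and move on.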
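The caching behaviour of __getitem__ above can be sketched in isolation. This is a minimal illustration, not Django's code: FakeWrapper and LazyConnectionHandler are hypothetical names, and the sketch leaves out ensure_defaults, prepare_test_settings and load_backend entirely.

```python
import threading


class FakeWrapper:
    """Stand-in for a backend's DatabaseWrapper (hypothetical, for illustration)."""

    def __init__(self, settings_dict, alias):
        self.settings_dict = settings_dict
        self.alias = alias


class LazyConnectionHandler:
    """Creates one wrapper per alias on first access and caches it per thread."""

    def __init__(self, databases):
        self.databases = databases
        self._connections = threading.local()

    def __getitem__(self, alias):
        # Return the cached wrapper if this thread already created one.
        if hasattr(self._connections, alias):
            return getattr(self._connections, alias)
        # Otherwise build it from the settings and remember it.
        conn = FakeWrapper(self.databases[alias], alias)
        setattr(self._connections, alias, conn)
        return conn


# Assumed sample settings, shaped like a DATABASES entry.
connections = LazyConnectionHandler({"default": {"ENGINE": "sqlite3", "NAME": ":memory:"}})
first = connections["default"]
second = connections["default"]
print(first is second)
```

Note that building the wrapper object is not the same as opening a database connection, which matches the observation that nothing has connected yet at this point.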


django/db/migrations/loader.py

Next is deciding which migrations to run. Extracting only the relevant parts gives the following.


        # Work out which apps have migrations and which do not
        executor = MigrationExecutor(connection, self.migration_progress_callback)
        
        # (what looks like error handling omitted)
        
        # If they supplied command line arguments, work out what they mean.
        target_app_labels_only = True
        if options['app_label'] and options['migration_name']:
            # omitted
        elif options['app_label']:
            # omitted
        else:
            targets = executor.loader.graph.leaf_nodes()
 
        plan = executor.migration_plan(targets)

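The key player appears to be MigrationExecutor. The imports at the top of the file show that the MigrationExecutor class lives in the django.db.migrations.executor module.

executor also has a loader attribute. This is an instance of the MigrationLoader class, defined in the loader module that sits next to the executor module.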

 

class MigrationLoader(object):
    """
    Loads migration files from disk, and their status from the database.
 
    Migration files are expected to live in the "migrations" directory of
    an app. Their names are entirely unimportant from a code perspective,
    but will probably follow the 1234_name.py convention.
 
    On initialization, this class will scan those directories, and open and
    read the python files, looking for a class called Migration, which should
    inherit from django.db.migrations.Migration. See
    django.db.migrations.migration for what that looks like.
    
    (rest omitted)
    """
 
    def __init__(self, connection, load=True, ignore_no_migrations=False):
        self.connection = connection
        self.disk_migrations = None
        self.applied_migrations = None
        self.ignore_no_migrations = ignore_no_migrations
        if load:
            self.build_graph()

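The class docstring tells us what this class does: it reads the files in each app's migrations directory, and it reads the database to track which migrations have been applied. Since __init__ calls build_graph, both appear to happen at initialization.

The beginning of the build_graph method: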

 

    def build_graph(self):
        """
        Builds a migration dependency graph using both the disk and database.
        You'll need to rebuild the graph if you apply migrations. This isn't
        usually a problem as generally migration stuff runs in a one-shot process.
        """
        # Load disk data
        self.load_disk()
        # Load database data
        if self.connection is None:
            self.applied_migrations = set()
        else:
            recorder = MigrationRecorder(self.connection)
            self.applied_migrations = recorder.applied_migrations()

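load_disk reads the migration files in each app's migrations directory, as the class docstring described. It simply reads them one after another, so I will skip the code. When load_disk finishes, a dictionary named disk_migrations has been built, shaped like

{(app label, migration name): Migration instance}

After load_disk returns, the MigrationRecorder class is used to fetch the already-applied migrations from the database. This is the first run, so the result is actually empty. Details will come later, but the migration records themselves are implemented with a Django model.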
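To make the recorder's role concrete, here is a rough sketch using only the standard sqlite3 module. ToyRecorder is a hypothetical class; it is an assumption of this sketch that a single table with app and name columns is enough, and it uses raw SQL where Django uses a model.

```python
import sqlite3


class ToyRecorder:
    """Hypothetical recorder: stores applied migrations in one table."""

    def __init__(self, connection):
        self.connection = connection

    def ensure_schema(self):
        # Create the bookkeeping table the first time it is needed.
        self.connection.execute(
            "CREATE TABLE IF NOT EXISTS django_migrations ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "app TEXT NOT NULL, name TEXT NOT NULL, applied TEXT NOT NULL)"
        )

    def applied_migrations(self):
        # Return the applied migrations as a set of (app, name) tuples.
        self.ensure_schema()
        rows = self.connection.execute("SELECT app, name FROM django_migrations")
        return {(app, name) for app, name in rows}

    def record_applied(self, app, name):
        self.ensure_schema()
        self.connection.execute(
            "INSERT INTO django_migrations (app, name, applied) "
            "VALUES (?, ?, datetime('now'))",
            (app, name),
        )


conn = sqlite3.connect(":memory:")
recorder = ToyRecorder(conn)
before = recorder.applied_migrations()  # empty on a fresh database
recorder.record_applied("contenttypes", "0001_initial")
after = recorder.applied_migrations()
print(before, after)
```

On a fresh database the first call returns an empty set, which corresponds to the "first run, so nothing is applied" situation described above.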

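Now we know the apps' migrations and how far they have been applied (again, nothing has been applied on this first run). Next, the loader examines the dependencies between migrations and builds a graph that expresses the execution order. None of these migrations seem to set replaces, so I will skip that part.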


        # To start, populate the migration graph with nodes for ALL migrations
        # and their dependencies. Also make note of replacing migrations at this step.
        self.graph = MigrationGraph()
        self.replacements = {}
        for key, migration in self.disk_migrations.items():
            self.graph.add_node(key, migration)
            # Internal (aka same-app) dependencies.
            self.add_internal_dependencies(key, migration)
            # Replacing migrations.
            if migration.replaces:
                self.replacements[key] = migration
        # Add external dependencies now that the internal ones have been resolved.
        for key, migration in self.disk_migrations.items():
            self.add_external_dependencies(key, migration)
        # Carry out replacements where possible.
        for key, migration in self.replacements.items():
            # omitted

Each Migration lists its dependencies (the migrations that must run before it) in dependencies. For example, 0002 of the auth app looks like this.

class Migration(migrations.Migration):
 
    dependencies = [
        ('auth', '0001_initial'),
    ]

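The graph is built in these steps:

Add every migration as a node
Add internal (same-app) dependencies. Normally this means each migration depends on the one numbered before it, as in 0008→0007→0006
Add external (other-app) dependencies. This is done in a separate loop to guarantee that the migration (node) being depended on already exists in the graph. Incidentally, 0001 of auth depends on the first migration of the contenttypes app

After this the graph is validated, but setting that aside, the loading process is now complete, if a little long.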
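The build steps above can be sketched with a toy graph. ToyMigrationGraph and the sample disk_migrations data are hypothetical; unlike the quoted loader code, this sketch adds all nodes before any edge, and it writes the contenttypes dependency as an explicit migration name for simplicity. Treating a leaf as "a node that no migration of the same app depends on" is also an assumption of the sketch.

```python
class ToyMigrationGraph:
    """A tiny stand-in for a migration graph (hypothetical, not Django's class)."""

    def __init__(self):
        self.parents = {}   # node -> set of nodes it depends on
        self.children = {}  # node -> set of nodes that depend on it

    def add_node(self, key):
        self.parents[key] = set()
        self.children[key] = set()

    def add_dependency(self, child, parent):
        # Both ends must already be nodes, which is why nodes are added first.
        if child not in self.parents or parent not in self.parents:
            raise KeyError("unknown node in dependency %r -> %r" % (child, parent))
        self.parents[child].add(parent)
        self.children[parent].add(child)

    def leaf_nodes(self):
        # A leaf is a node that no migration of the same app depends on.
        return sorted(
            node for node, kids in self.children.items()
            if not any(kid[0] == node[0] for kid in kids)
        )


# Assumed sample data: (app label, migration name) -> list of dependencies.
disk_migrations = {
    ("contenttypes", "0001_initial"): [],
    ("auth", "0001_initial"): [("contenttypes", "0001_initial")],
    ("auth", "0002_alter_permission_name_max_length"): [("auth", "0001_initial")],
}

graph = ToyMigrationGraph()
for key in disk_migrations:
    graph.add_node(key)
# Internal (same-app) dependencies first.
for key, deps in disk_migrations.items():
    for parent in deps:
        if parent[0] == key[0]:
            graph.add_dependency(key, parent)
# External (other-app) dependencies in a separate pass.
for key, deps in disk_migrations.items():
    for parent in deps:
        if parent[0] != key[0]:
            graph.add_dependency(key, parent)

leaves = graph.leaf_nodes()
print(leaves)
```

Here the leaves are the last auth migration and the only contenttypes migration: the latter has a dependent, but that dependent belongs to another app.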

Back in migrate.py, here again is the part we are looking at:


        # Work out which apps have migrations and which do not
        executor = MigrationExecutor(connection, self.migration_progress_callback)
        
        # (what looks like error handling omitted)
        
        # If they supplied command line arguments, work out what they mean.
        target_app_labels_only = True
        if options['app_label'] and options['migration_name']:
            # omitted
        elif options['app_label']:
            # omitted
        else:
            targets = executor.loader.graph.leaf_nodes()
 
        plan = executor.migration_plan(targets)

graph.leaf_nodes is easy enough to guess, so I will skip it. On to migration_plan. Quoting it in full would be long, so here are only the necessary parts.

 

    def migration_plan(self, targets, clean_start=False):
        """
        Given a set of targets, returns a list of (Migration instance, backwards?).
        """
        plan = []
        if clean_start:
            applied = set()
        else:
            applied = set(self.loader.applied_migrations)
        for target in targets:
            # If the target is (app_label, None), that means unmigrate everything
            if target[1] is None:
                # omitted
            # If the migration is already applied, do backwards mode,
            # otherwise do forwards mode.
            elif target in applied:
                # omitted
            else:
                for migration in self.loader.graph.forwards_plan(target):
                    if migration not in applied:
                        plan.append((self.loader.graph.nodes[migration], False))
                        applied.add(migration)
        return plan

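I won't follow the code into forwards_plan, but as the comments say, it lists the migrations (the dependencies) required to apply target (a leaf node, normally the last migration of an app), and these are recorded in the list of migrations to run.

In the end, plan holds a list of tuples of the form

(Migration instance, False)

The False means that backwards (rolling a migration back) is False; in other words, the migration moves forward.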
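As a rough illustration of the forward branch only, the plan can be computed with a depth-first walk over the dependencies. Both functions below are hypothetical stand-ins written for this sketch: forwards_plan here is not Django's implementation, and the plan holds (app, name) keys rather than Migration instances. The backwards and unmigrate branches are not modelled.

```python
def forwards_plan(dependencies, target):
    """Return target and everything it depends on, dependencies first."""
    ordered, seen = [], set()

    def visit(node):
        if node in seen:
            return
        seen.add(node)
        for parent in sorted(dependencies[node]):
            visit(parent)
        ordered.append(node)  # appended only after all of its dependencies

    visit(target)
    return ordered


def migration_plan(dependencies, targets, applied):
    """Build a list of (migration key, backwards=False) for unapplied migrations."""
    plan = []
    applied = set(applied)  # copy, so the caller's set is not modified
    for target in targets:
        for migration in forwards_plan(dependencies, target):
            if migration not in applied:
                plan.append((migration, False))
                applied.add(migration)
    return plan


# Assumed sample data: node -> set of nodes it depends on.
dependencies = {
    ("contenttypes", "0001_initial"): set(),
    ("auth", "0001_initial"): {("contenttypes", "0001_initial")},
    ("auth", "0002_alter_permission_name_max_length"): {("auth", "0001_initial")},
    ("sessions", "0001_initial"): set(),
}
targets = [
    ("auth", "0002_alter_permission_name_max_length"),
    ("sessions", "0001_initial"),
]
plan = migration_plan(dependencies, targets, applied=set())
partial = migration_plan(dependencies, targets, applied={("contenttypes", "0001_initial")})
for entry in plan:
    print(entry)
```

With an empty applied set every dependency ends up in the plan; once contenttypes 0001 is marked as applied, the plan starts from auth 0001 instead.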


django/db/migrations/executor.py

The migrations to run and their order are now decided, so what remains is running them. Back in migrate.py, the important part of the rest of handle is the following.


        pre_migrate_state = executor._create_project_state(with_applied_migrations=True)
        # omitted
        post_migrate_state = executor.migrate(
            targets, plan=plan, state=pre_migrate_state.clone(), fake=fake,
            fake_initial=fake_initial,
        )

The migrate method of the executor. It branches in several ways, but for an ordinary migrate it ends up calling _migrate_all_forwards.

 

    def migrate(self, targets, plan=None, state=None, fake=False, fake_initial=False):
        """
        Migrates the database up to the given targets.
 
        Django first needs to create all project states before a migration is
        (un)applied and in a second step run all the database operations.
        """
        if plan is None:
            plan = self.migration_plan(targets)
        # Create the forwards plan Django would follow on an empty database
        full_plan = self.migration_plan(self.loader.graph.leaf_nodes(), clean_start=True)
 
        all_forwards = all(not backwards for mig, backwards in plan)
        all_backwards = all(backwards for mig, backwards in plan)
 
        if not plan:
            # omitted
        elif all_forwards == all_backwards:
            # omitted
        elif all_forwards:
            if state is None:
                # The resulting state should still include applied migrations.
                state = self._create_project_state(with_applied_migrations=True)
            state = self._migrate_all_forwards(state, plan, full_plan, fake=fake, fake_initial=fake_initial)
        else:
            # No need to check for `elif all_backwards` here, as that condition
            # would always evaluate to true.
            state = self._migrate_all_backwards(plan, full_plan, fake=fake)
 
        self.check_replacements()
 
        return state

The _migrate_all_forwards method.

 

    def _migrate_all_forwards(self, state, plan, full_plan, fake, fake_initial):
        """
        Take a list of 2-tuples of the form (migration instance, False) and
        apply them in the order they occur in the full_plan.
        """
        migrations_to_run = {m[0] for m in plan}
        for migration, _ in full_plan:
            if not migrations_to_run:
                # We remove every migration that we applied from these sets so
                # that we can bail out once the last migration has been applied
                # and don't always run until the very end of the migration
                # process.
                break
            if migration in migrations_to_run:
                if 'apps' not in state.__dict__:
                    if self.progress_callback:
                        self.progress_callback("render_start")
                    state.apps  # Render all -- performance critical
                    if self.progress_callback:
                        self.progress_callback("render_success")
                state = self.apply_migration(state, migration, fake=fake, fake_initial=fake_initial)
                migrations_to_run.remove(migration)
 
        return state

The migration itself appears to be carried out by calling apply_migration for each migration.
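The loop in _migrate_all_forwards has one detail worth isolating: the order comes from full_plan, not from plan, and the loop stops as soon as nothing is left to run. A stripped-down sketch, in which the function name and the apply callback are assumptions and the state and progress-callback handling are left out:

```python
def migrate_all_forwards(plan, full_plan, apply):
    """Apply the migrations named in plan, following the order of full_plan."""
    migrations_to_run = {migration for migration, _ in plan}
    for migration, _ in full_plan:
        if not migrations_to_run:
            break  # everything requested has been applied; skip the rest
        if migration in migrations_to_run:
            apply(migration)
            migrations_to_run.remove(migration)


# Assumed sample data: plain strings stand in for Migration instances.
full_plan = [("a", False), ("b", False), ("c", False), ("d", False)]
plan = [("c", False), ("b", False)]  # deliberately out of order
applied_order = []
migrate_all_forwards(plan, full_plan, applied_order.append)
print(applied_order)
```

Even though plan lists c before b, they are applied as b then c, and d is never reached because the set is already empty.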

 

    def apply_migration(self, state, migration, fake=False, fake_initial=False):
        """
        Runs a migration forwards.
        """
        if self.progress_callback:
            self.progress_callback("apply_start", migration, fake)
        if not fake:
            if fake_initial:
                # Test to see if this is an already-applied initial migration
                applied, state = self.detect_soft_applied(state, migration)
                if applied:
                    fake = True
            if not fake:
                # Alright, do it normally
                with self.connection.schema_editor(atomic=migration.atomic) as schema_editor:
                    state = migration.apply(state, schema_editor)
        # For replacement migrations, record individual statuses
        if migration.replaces:
            for app_label, name in migration.replaces:
                self.recorder.record_applied(app_label, name)
        else:
            self.recorder.record_applied(migration.app_label, migration.name)
        # Report progress
        if self.progress_callback:
            self.progress_callback("apply_success", migration, fake)
        return state

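The heart of the migration is here:

with self.connection.schema_editor(atomic=migration.atomic) as schema_editor:
    state = migration.apply(state, schema_editor)

This time, however, I will stop at confirming the flow by which the migrate command runs each migration. How each migration is actually applied and how tables get created in the database is something I would like to look at later, when migrating my own app.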
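One thing the apply_migration code quoted earlier does show is how the fake flag behaves: the schema changes are skipped, but the migration is still recorded as applied. A minimal sketch of just that behaviour, where the function signature, the run_operations callback and the recorded set are hypothetical, and fake_initial, replaces and the progress callbacks are left out:

```python
def apply_migration(migration, run_operations, recorded, fake=False):
    """Run a migration unless fake is set, then record it as applied either way."""
    if not fake:
        run_operations(migration)  # stands in for migration.apply(...)
    recorded.add(migration)        # stands in for recorder.record_applied(...)


executed = []
recorded = set()
apply_migration(("auth", "0001_initial"), executed.append, recorded)
apply_migration(
    ("auth", "0002_alter_permission_name_max_length"),
    executed.append,
    recorded,
    fake=True,
)
print(executed)
print(sorted(recorded))
```

Only the first migration is executed, while both end up in the recorded set.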


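Conclusion

This time we looked at the flow of a migration. Django collects the migration definitions in each app's migrations folder, resolves their dependencies, and runs them one by one, which is obvious enough. What impressed me is that this is not done in a single class; the roles are divided among executor, loader, recorder, and graph. (That said, I did find it a bit questionable that both executor and loader hold a recorder instance.)

Come to think of it, what about the database connection? The recorder must connect to the database, so... hmm, it's here. I will explain it when we migrate our own app. Rather than connecting explicitly, it connects when a connection is first needed, which is a common approach.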
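For readers unfamiliar with the connect-on-demand pattern mentioned here, a generic sketch with the standard sqlite3 module follows. LazyDatabase is a hypothetical class written for illustration; it is not the Django code referred to above.

```python
import sqlite3


class LazyDatabase:
    """Hypothetical wrapper that opens its connection on first use."""

    def __init__(self, path):
        self.path = path
        self.connection = None  # nothing is opened yet

    def cursor(self):
        # Connect only when a cursor is actually requested.
        if self.connection is None:
            self.connection = sqlite3.connect(self.path)
        return self.connection.cursor()


db = LazyDatabase(":memory:")
connected_before = db.connection is not None
value = db.cursor().execute("SELECT 1").fetchone()[0]
connected_after = db.connection is not None
print(connected_before, value, connected_after)
```

Creating the wrapper costs nothing; the connection appears only once some code, such as a recorder reading its table, asks for a cursor.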