Android 8 Zygote Source Code Analysis: Study Notes (Part 2)

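This post analyzes the core flow of the Zygote service in Android, focusing on: 1) Zygote is started from the init.zygote32.rc configuration file and run by the app_process program; 2) the main method of the ZygoteInit class creates a ZygoteServer and registers a listening socket; 3) system resources (classes, libraries, and so on) are preloaded; 4) the key forkSystemServer method creates the system server process, which involves setting permissions, configuring resource limits, and more; 5) run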

g_i_a_o_giao


g_i_a_o_giao · Published 2025-08-25 23:11:31

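Previous post:

Android 8 Zygote Source Code Analysis: Study Notes (Part 1):

That post analyzed the startup flow of the Zygote service. Zygote is started from the init.zygote32.rc configuration file and run by the app_process program with root privileges, with the "--zygote" and "--start-system-server" flags among its arguments. The source analysis starts at the main function of app_main.cpp, which parses the arguments, creates an AppRuntime object, calls its start method, and finally invokes the main method of the ZygoteInit class through JNI. The key steps include argument parsing, virtual machine startup, and JNI method registration.
https://blog.csdn.net/g_i_a_o_giao/article/details/150613561?spm=1001.2014.3001.5501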

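Following the previous post, execution has now reached the main method of the familiar Java-layer ZygoteInit class. Let's look at the source of ZygoteInit.java under the /android/internal/os directory.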

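Start with the main method. It first creates a ZygoteServer object and then blocks thread creation during startup. Next it parses the argument array and calls registerServerSocket on zygoteServer to register a socket and listen on it. It then checks whether preloading should happen now, and once loading is done it triggers one manual GC to reclaim memory. After that comes the most important call, forkSystemServer, which returns a Runnable; if it is non-null (which only happens in the child, system_server), its run method is executed. Finally, main calls runSelectLoop on the zygoteServer object. When that call returns a Runnable, we are in a newly forked application process, and its run method is called. That is the overall flow of main; below I analyze the more important methods in it.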

public static void main(String argv[]) {
        ZygoteServer zygoteServer = new ZygoteServer();

        // Mark zygote start. This ensures that thread creation will throw
        // an error.
        // This prevents threads from being created by accident while Zygote initializes,
        // which keeps Zygote's state deterministic.
        ZygoteHooks.startZygoteNoThreadCreation();
        // Zygote goes into its own process group and becomes its leader.
        try {
            Os.setpgid(0, 0);
        } catch (ErrnoException ex) {
            throw new RuntimeException("Failed to setpgid(0,0)", ex);
        }

        final Runnable caller;
        try {
            // Report Zygote start time to tron unless it is a runtime restart
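            // That is, the start time is reported to tron only on a normal boot, not on a runtime restart.
            // As covered in one of my earlier posts, the sys.boot_completed system property
            // is set to 1 once boot has finished, after the boot animation ends.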
            if (!"1".equals(SystemProperties.get("sys.boot_completed"))) {
                MetricsLogger.histogram(null, "boot_zygote_init",
                        (int) SystemClock.elapsedRealtime());
            }
            // Create the timing trace tag that matches the process architecture (32-bit or 64-bit)
            String bootTimeTag = Process.is64Bit() ? "Zygote64Timing" : "Zygote32Timing";
            TimingsTraceLog bootTimingsTraceLog = new TimingsTraceLog(bootTimeTag,
                    Trace.TRACE_TAG_DALVIK);
            bootTimingsTraceLog.traceBegin("ZygoteInit");
            RuntimeInit.enableDdms();
            // Initialize the variables filled in by command-line parsing
            boolean startSystemServer = false;
            String socketName = "zygote";
            String abiList = null;
            boolean enableLazyPreload = false;
            // Parse the command-line arguments (the first one is skipped; it is usually the class name)
            for (int i = 1; i < argv.length; i++) {
                if ("start-system-server".equals(argv[i])) {
                    startSystemServer = true;
                } else if ("--enable-lazy-preload".equals(argv[i])) {
                    enableLazyPreload = true;
                } else if (argv[i].startsWith(ABI_LIST_ARG)) {
                    abiList = argv[i].substring(ABI_LIST_ARG.length());
                } else if (argv[i].startsWith(SOCKET_NAME_ARG)) {
                    socketName = argv[i].substring(SOCKET_NAME_ARG.length());
                } else {
                    throw new RuntimeException("Unknown command line argument: " + argv[i]);
                }
            }

            if (abiList == null) {
                throw new RuntimeException("No ABI list supplied.");
            }
            // Register the listening socket
            zygoteServer.registerServerSocket(socketName);
            // In some configurations, we avoid preloading resources and classes eagerly.
            // In such cases, we will preload things prior to our first fork.
            if (!enableLazyPreload) {
                bootTimingsTraceLog.traceBegin("ZygotePreload");
                EventLog.writeEvent(LOG_BOOT_PROGRESS_PRELOAD_START,
                    SystemClock.uptimeMillis());
                // Run the preload (classes, resources, shared libraries, and so on)
                preload(bootTimingsTraceLog);
                EventLog.writeEvent(LOG_BOOT_PROGRESS_PRELOAD_END,
                    SystemClock.uptimeMillis());
                bootTimingsTraceLog.traceEnd(); // ZygotePreload
            } else {
                Zygote.resetNicePriority();
            }

            // Do an initial gc to clean up after startup
            bootTimingsTraceLog.traceBegin("PostZygoteInitGC");
            gcAndFinalize();
            bootTimingsTraceLog.traceEnd(); // PostZygoteInitGC

            bootTimingsTraceLog.traceEnd(); // ZygoteInit
            // Disable tracing so that forked processes do not inherit stale tracing tags from
            // Zygote.
            Trace.setTracingEnabled(false, 0);

            // Zygote process unmounts root storage spaces.
            Zygote.nativeUnmountStorageOnInit();

            // Set seccomp policy
            Seccomp.setPolicy();
            // Leave the no-thread-creation mode; threads may be created from here on
            ZygoteHooks.stopZygoteNoThreadCreation();
            // Start the system server if requested (important)
            if (startSystemServer) {
                // Fork the system server process
                Runnable r = forkSystemServer(abiList, socketName, zygoteServer);

                // {@code r == null} in the parent (zygote) process, and {@code r != null} in the
                // child (system_server) process.
                if (r != null) {
                    r.run(); // Run the system server's initialization code in the child process
                    return;  // The child returns here and never enters Zygote's loop
                }
            }

            Log.i(TAG, "Accepting command socket connections");

            // The select loop returns early in the child process after a fork and
            // loops forever in the zygote.
            caller = zygoteServer.runSelectLoop(abiList);
        } catch (Throwable ex) {
            Log.e(TAG, "System zygote died with exception", ex);
            throw ex;
        } finally {
            zygoteServer.closeServerSocket();
        }

        // We're in the child process and have exited the select loop. Proceed to execute the
        // command.
        if (caller != null) {
            caller.run();
        }
    }
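To make the argument handling in main easier to try outside Android, here is a minimal standalone sketch of the same parsing loop. The class ZygoteArgsSketch and its Parsed holder are hypothetical names used only for illustration, and the prefix values "--abi-list=" and "--socket-name=" are assumptions standing in for the ABI_LIST_ARG and SOCKET_NAME_ARG constants, whose definitions are not shown in the excerpt above.

```java
public class ZygoteArgsSketch {
    // Assumed values for the two prefix constants referenced in main().
    static final String ABI_LIST_ARG = "--abi-list=";
    static final String SOCKET_NAME_ARG = "--socket-name=";

    /** Hypothetical holder for the parsed flags; not part of the Android source. */
    static final class Parsed {
        boolean startSystemServer = false;
        boolean enableLazyPreload = false;
        String socketName = "zygote"; // same default as in main()
        String abiList = null;
    }

    static Parsed parse(String[] argv) {
        Parsed p = new Parsed();
        // argv[0] is skipped, mirroring the loop in main().
        for (int i = 1; i < argv.length; i++) {
            String arg = argv[i];
            if ("start-system-server".equals(arg)) {
                p.startSystemServer = true;
            } else if ("--enable-lazy-preload".equals(arg)) {
                p.enableLazyPreload = true;
            } else if (arg.startsWith(ABI_LIST_ARG)) {
                p.abiList = arg.substring(ABI_LIST_ARG.length());
            } else if (arg.startsWith(SOCKET_NAME_ARG)) {
                p.socketName = arg.substring(SOCKET_NAME_ARG.length());
            } else {
                throw new RuntimeException("Unknown command line argument: " + arg);
            }
        }
        // The ABI list is the only mandatory argument.
        if (p.abiList == null) {
            throw new RuntimeException("No ABI list supplied.");
        }
        return p;
    }

    public static void main(String[] args) {
        // Example argument values are made up for this demo.
        Parsed p = parse(new String[] {
            "com.android.internal.os.ZygoteInit",
            "start-system-server",
            "--abi-list=armeabi-v7a,armeabi"});
        System.out.println(p.startSystemServer + " " + p.socketName + " " + p.abiList);
    }
}
```

Note that the socket name silently falls back to "zygote" when no socket-name argument is given, while a missing ABI list is treated as a fatal error.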

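First, registerServerSocket. It builds an environment variable name from the socket name passed in, reads the file descriptor of the already-created socket from that variable, and wraps it in a LocalServerSocket to listen on.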

private static final String ANDROID_SOCKET_PREFIX = "ANDROID_SOCKET_";

private LocalServerSocket mServerSocket;

/**
     * Registers a server socket for zygote command connections
     *
     * @throws RuntimeException when open fails
     */
    // The socketName passed here is "zygote"
    void registerServerSocket(String socketName) {
        // Make sure the server socket is initialized only once
        if (mServerSocket == null) {
            int fileDesc;
            // Build the full environment variable name, here "ANDROID_SOCKET_zygote"
            final String fullSocketName = ANDROID_SOCKET_PREFIX + socketName;
            try {
                // Read the file descriptor number from the environment variable
                String env = System.getenv(fullSocketName);
                fileDesc = Integer.parseInt(env);
            } catch (RuntimeException ex) {
                // The environment variable is unset or its value is not a valid integer
                throw new RuntimeException(fullSocketName + " unset or invalid", ex);
            }
            
            try {
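                // Create a new FileDescriptor object
                FileDescriptor fd = new FileDescriptor();
                // Set its integer value through the hidden setInt$ method
                fd.setInt$(fileDesc);
                // Wrap the existing file descriptor in a LocalServerSocket.
                // Note: this does not create a new socket; it reuses the one init already created.
                // Assign it to mServerSocket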
                mServerSocket = new LocalServerSocket(fd);
            } catch (IOException ex) {
                throw new RuntimeException(
                        "Error binding to local socket '" + fileDesc + "'", ex);
            }
        }
    }
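The environment-variable lookup above can be sketched on its own. This is a minimal illustration, assuming a hypothetical class SocketEnvSketch that takes the environment as a Map so it can be exercised without init; the fd value 9 is made up for the demo, since the real number is whatever init assigned at boot.

```java
import java.util.HashMap;
import java.util.Map;

public class SocketEnvSketch {
    static final String ANDROID_SOCKET_PREFIX = "ANDROID_SOCKET_";

    /**
     * Builds the variable name and parses the fd number stored in it.
     * The environment is passed in as a Map (an assumption for testability);
     * the code discussed above reads it with System.getenv instead.
     */
    static int socketFdFromEnv(Map<String, String> env, String socketName) {
        String fullSocketName = ANDROID_SOCKET_PREFIX + socketName;
        try {
            // Integer.parseInt(null) throws NumberFormatException, which is a
            // RuntimeException, so "unset" and "not a number" share one handler.
            return Integer.parseInt(env.get(fullSocketName));
        } catch (RuntimeException ex) {
            throw new RuntimeException(fullSocketName + " unset or invalid", ex);
        }
    }

    public static void main(String[] args) {
        Map<String, String> fakeEnv = new HashMap<>();
        fakeEnv.put("ANDROID_SOCKET_zygote", "9"); // hypothetical fd number
        System.out.println(socketFdFromEnv(fakeEnv, "zygote"));
    }
}
```

The single catch block is why the original error message reads "unset or invalid": both failure modes surface as a RuntimeException from the parse step.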

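So when is this socket created? Look again at the zygote.rc configuration file. Its socket line tells the init process to create the socket and publish its file descriptor in the environment (under the name ANDROID_SOCKET_zygote); registerServerSocket simply picks up that descriptor and listens on it.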

service zygote /system/bin/app_process -Xzygote /system/bin --zygote --start-system-server
    class main
    priority -20
    user root
    group root readproc
    socket zygote stream 660 root system
    onrestart write /sys/android_power/request_state wake
    onrestart write /sys/power/state on
    onrestart restart audioserver
    onrestart restart cameraserver
    onrestart restart media
    onrestart restart netd
    onrestart restart wificond
    writepid /dev/cpuset/foreground/tasks

Next, another important method called from main: preload.

This method preloads classes, resources, shared libraries, OpenGL, and more.

static void preload(TimingsTraceLog bootTimingsTraceLog) {
        Log.d(TAG, "begin preload");
        bootTimingsTraceLog.traceBegin("BeginIcuCachePinning");
        beginIcuCachePinning();
        bootTimingsTraceLog.traceEnd(); // BeginIcuCachePinning
        bootTimingsTraceLog.traceBegin("PreloadClasses");
        // Preload classes
        preloadClasses();
        bootTimingsTraceLog.traceEnd(); // PreloadClasses
        bootTimingsTraceLog.traceBegin("PreloadResources");
        // Preload resources
        preloadResources();
        bootTimingsTraceLog.traceEnd(); // PreloadResources
        Trace.traceBegin(Trace.TRACE_TAG_DALVIK, "PreloadAppProcessHALs");
        nativePreloadAppProcessHALs();
        Trace.traceEnd(Trace.TRACE_TAG_DALVIK);
        Trace.traceBegin(Trace.TRACE_TAG_DALVIK, "PreloadOpenGL");
        // Preload OpenGL
        preloadOpenGL();
        Trace.traceEnd(Trace.TRACE_TAG_DALVIK);
        // Preload shared libraries
        preloadSharedLibraries();
        // Preload text resources
        preloadTextResources();
        // Ask the WebViewFactory to do any initialization that must run in the zygote process,
        // for memory sharing purposes.
        WebViewFactory.prepareWebViewInZygote();
        endIcuCachePinning();
        warmUpJcaProviders();
        Log.d(TAG, "end preload");

        sPreloadComplete = true;
    }

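Given the limited space, we will only look at part of the preload code, namely class preloading. It opens a FileInputStream on the file at the target path, reads it line by line in a simple format (one class name per line, with # comments and blank lines skipped), and preloads each class.

// Path of the file that lists the classes to preload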
private static final String PRELOADED_CLASSES = "/system/etc/preloaded-classes";

private static void preloadClasses() {
        final VMRuntime runtime = VMRuntime.getRuntime();

        InputStream is;
        try {
            is = new FileInputStream(PRELOADED_CLASSES);
        } catch (FileNotFoundException e) {
            Log.e(TAG, "Couldn't find " + PRELOADED_CLASSES + ".");
            return;
        }

        Log.i(TAG, "Preloading classes...");
        long startTime = SystemClock.uptimeMillis();

        // Drop root perms while running static initializers.
        final int reuid = Os.getuid();
        final int regid = Os.getgid();

        // We need to drop root perms only if we're already root. In the case of "wrapped"
        // processes (see WrapperInit), this function is called from an unprivileged uid
        // and gid.
        boolean droppedPriviliges = false;
        if (reuid == ROOT_UID && regid == ROOT_GID) {
            try {
                Os.setregid(ROOT_GID, UNPRIVILEGED_GID);
                Os.setreuid(ROOT_UID, UNPRIVILEGED_UID);
            } catch (ErrnoException ex) {
                throw new RuntimeException("Failed to drop root", ex);
            }

            droppedPriviliges = true;
        }

        // Alter the target heap utilization.  With explicit GCs this
        // is not likely to have any effect.
        float defaultUtilization = runtime.getTargetHeapUtilization();
        runtime.setTargetHeapUtilization(0.8f);

        try {
            BufferedReader br
                = new BufferedReader(new InputStreamReader(is), 256);

            int count = 0;
            String line;
            while ((line = br.readLine()) != null) {
                // Skip comments and blank lines.
                line = line.trim();
                if (line.startsWith("#") || line.equals("")) {
                    continue;
                }

                Trace.traceBegin(Trace.TRACE_TAG_DALVIK, line);
                try {
                    if (false) {
                        Log.v(TAG, "Preloading " + line + "...");
                    }
                    // Load and explicitly initialize the given class. Use
                    // Class.forName(String, boolean, ClassLoader) to avoid repeated stack lookups
                    // (to derive the caller's class-loader). Use true to force initialization, and
                    // null for the boot classpath class-loader (could as well cache the
                    // class-loader of this class in a variable).
                    Class.forName(line, true, null);
                    count++;
                } catch (ClassNotFoundException e) {
                    Log.w(TAG, "Class not found for preloading: " + line);
                } catch (UnsatisfiedLinkError e) {
                    Log.w(TAG, "Problem preloading " + line + ": " + e);
                } catch (Throwable t) {
                    Log.e(TAG, "Error preloading " + line + ".", t);
                    if (t instanceof Error) {
                        throw (Error) t;
                    }
                    if (t instanceof RuntimeException) {
                        throw (RuntimeException) t;
                    }
                    throw new RuntimeException(t);
                }
                Trace.traceEnd(Trace.TRACE_TAG_DALVIK);
            }

            Log.i(TAG, "...preloaded " + count + " classes in "
                    + (SystemClock.uptimeMillis()-startTime) + "ms.");
        } catch (IOException e) {
            Log.e(TAG, "Error reading " + PRELOADED_CLASSES + ".", e);
        } finally {
            IoUtils.closeQuietly(is);
            // Restore default.
            runtime.setTargetHeapUtilization(defaultUtilization);

            // Fill in dex caches with classes, fields, and methods brought in by preloading.
            Trace.traceBegin(Trace.TRACE_TAG_DALVIK, "PreloadDexCaches");
            runtime.preloadDexCaches();
            Trace.traceEnd(Trace.TRACE_TAG_DALVIK);

            // Bring back root. We'll need it later if we're in the zygote.
            if (droppedPriviliges) {
                try {
                    Os.setreuid(ROOT_UID, ROOT_UID);
                    Os.setregid(ROOT_GID, ROOT_GID);
                } catch (ErrnoException ex) {
                    throw new RuntimeException("Failed to restore root", ex);
                }
            }
        }
    }

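Now let's look at the /system/etc/preloaded-classes file. It lists several thousand classes commonly used in the system; there are too many to go through one by one.

(Excerpt; most entries omitted)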

.......................................

android.accounts.IAccountManager$Stub
android.accounts.IAccountManager$Stub$Proxy
android.accounts.IAccountManagerResponse
android.accounts.IAccountManagerResponse$Stub
android.accounts.OnAccountsUpdateListener
android.accounts.OperationCanceledException
android.animation.AnimationHandler
android.animation.AnimationHandler$1

.......................................
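The file format and the loading loop can be tried on a desktop JVM with a small sketch. The class PreloadListSketch is a hypothetical name, the input is an in-memory string rather than /system/etc/preloaded-classes, and the class names in the demo are ordinary JDK classes chosen only because the boot class loader can find them. It is a simplified illustration, not the Android implementation (no privilege dropping, tracing, or heap tuning).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

public class PreloadListSketch {
    /** Returns the class names that loaded successfully. */
    static List<String> preload(String fileContents) {
        List<String> loaded = new ArrayList<>();
        try (BufferedReader br = new BufferedReader(new StringReader(fileContents))) {
            String line;
            while ((line = br.readLine()) != null) {
                // Skip comments and blank lines, as preloadClasses() does.
                line = line.trim();
                if (line.startsWith("#") || line.isEmpty()) {
                    continue;
                }
                try {
                    // initialize=true runs static initializers; a null loader
                    // means the boot (bootstrap) class loader.
                    Class.forName(line, true, null);
                    loaded.add(line);
                } catch (ClassNotFoundException e) {
                    // A missing class is logged and skipped, not fatal.
                    System.err.println("Class not found for preloading: " + line);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return loaded;
    }

    public static void main(String[] args) {
        String sample = "# sample list\n\njava.util.ArrayList\n"
                + "com.example.DoesNotExist\njava.lang.StringBuilder\n";
        System.out.println(preload(sample));
    }
}
```

Because the classes are loaded and initialized once in Zygote, every process forked from it starts with them already in memory, which is the point of the preload step.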

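Next comes the most critical method, forkSystemServer. It first sets up the capabilities and checks the configuration, then parses the arguments and applies the related system properties, and then reaches the key call, Zygote.forkSystemServer, so let's look at that method in the Zygote class as well. That method first prepares the environment for creating the process and then calls the native-layer fork function (nativeForkSystemServer).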

private static final String PROPERTY_RUNNING_IN_CONTAINER = "ro.boot.container";

/**
     * Prepare the arguments and forks for the system server process.
     *
     * Returns an {@code Runnable} that provides an entrypoint into system_server code in the
     * child process, and {@code null} in the parent.
     */
    private static Runnable forkSystemServer(String abiList, String socketName,
            ZygoteServer zygoteServer) {
        // 1. Compute the POSIX capabilities the system server process needs
        // and convert the capability constants into a bitmask
        long capabilities = posixCapabilitiesAsBits(
            OsConstants.CAP_IPC_LOCK,         // lock memory (prevent it from being swapped out)
            OsConstants.CAP_KILL,             // send signals to other processes
            OsConstants.CAP_NET_ADMIN,        // perform network administration operations
            OsConstants.CAP_NET_BIND_SERVICE, // bind to privileged ports (<1024)
            OsConstants.CAP_NET_BROADCAST,    // network broadcasting
            OsConstants.CAP_NET_RAW,          // use raw sockets
            OsConstants.CAP_SYS_MODULE,       // load and unload kernel modules
            OsConstants.CAP_SYS_NICE,         // raise process priority (nice value)
            OsConstants.CAP_SYS_PTRACE,       // trace (ptrace) other processes
            OsConstants.CAP_SYS_TIME,         // set the system clock
            OsConstants.CAP_SYS_TTY_CONFIG,   // configure TTY devices
            OsConstants.CAP_WAKE_ALARM        // trigger wake alarms
        );
        /* Containers run without this capability, so avoid setting it in that case */
        // Check whether we are running inside a container (the ro.boot.container property)
        if (!SystemProperties.getBoolean(PROPERTY_RUNNING_IN_CONTAINER, false)) {
            capabilities |= posixCapabilitiesAsBits(OsConstants.CAP_BLOCK_SUSPEND);
        }

        /* Hardcoded command line to start the system server */
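        // 2. The hardcoded command-line arguments used to start the system server.
        // They directly control the identity and privileges of the new process.
        String args[] = {
            "--setuid=1000", // user ID 1000 (the system user)
            "--setgid=1000", // group ID 1000 (the system group)
            // Supplementary group IDs, which give the process access to various system resources
            "--setgroups=1001,1002,1003,1004,1005,1006,1007,1008,1009,1010,1018,1021,1023,1032,3001,3002,3003,3006,3007,3009,3010",
            // Capabilities, in the form "permitted,effective"
            "--capabilities=" + capabilities + "," + capabilities,
            "--nice-name=system_server", // human-readable process name
            "--runtime-args",            // marks what follows as runtime arguments
            "com.android.server.SystemServer", // main class to run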
        };
        ZygoteConnection.Arguments parsedArgs = null;

        int pid;

        try {
            // 3. Parse the arguments and apply the debugger-related system properties
            parsedArgs = new ZygoteConnection.Arguments(args);
            ZygoteConnection.applyDebuggerSystemProperty(parsedArgs);
            ZygoteConnection.applyInvokeWithSystemProperty(parsedArgs);

            /* Request to fork the system server process */
            // 4. Perform the actual fork, creating the system server process
            pid = Zygote.forkSystemServer(
                    parsedArgs.uid, parsedArgs.gid,
                    parsedArgs.gids,
                    parsedArgs.debugFlags,
                    null,
                    parsedArgs.permittedCapabilities,
                    parsedArgs.effectiveCapabilities);
        } catch (IllegalArgumentException ex) {
            throw new RuntimeException(ex);
        }
        
        /* For child process */
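        // Child process branch (pid == 0)
        if (pid == 0) {
            // Check whether a second Zygote exists (e.g. the 32-bit Zygote on a 64-bit device)
            if (hasSecondZygote(abiList)) {
                // Wait for the secondary Zygote to be ready so that every ABI is available
                waitForSecondaryZygote(socketName);
            }
            // Close the server socket inherited from Zygote
            zygoteServer.closeServerSocket();
            // Initialize the system server process and return the Runnable entry point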
            return handleSystemServerProcess(parsedArgs);
        }

        return null;
    }
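The capability bitmask that ends up in the "--capabilities=" argument is simple bit arithmetic. The sketch below is a hedged illustration: CapabilityBitsSketch is a hypothetical class, its posixCapabilitiesAsBits is a re-creation written for this example rather than a copy of the Android method, and the three constants use the capability numbers from the Linux capability header (CAP_KILL = 5, CAP_NET_ADMIN = 12, CAP_BLOCK_SUSPEND = 36) instead of OsConstants, which is not available off-device.

```java
public class CapabilityBitsSketch {
    // Capability numbers as defined by Linux; only a subset is listed here.
    static final int CAP_KILL = 5;
    static final int CAP_NET_ADMIN = 12;
    static final int CAP_BLOCK_SUSPEND = 36;

    /** Sets bit N of the result for every capability number N passed in. */
    static long posixCapabilitiesAsBits(int... capabilities) {
        long result = 0;
        for (int capability : capabilities) {
            if (capability < 0 || capability > 63) {
                throw new IllegalArgumentException("Bad capability: " + capability);
            }
            // 1L (not 1) so that numbers above 31, such as 36, do not overflow an int.
            result |= (1L << capability);
        }
        return result;
    }

    public static void main(String[] args) {
        long caps = posixCapabilitiesAsBits(CAP_KILL, CAP_NET_ADMIN);
        boolean runningInContainer = false; // assumed for the demo
        if (!runningInContainer) {
            caps |= posixCapabilitiesAsBits(CAP_BLOCK_SUSPEND);
        }
        // Same value for the permitted and effective sets, as in forkSystemServer.
        System.out.println("--capabilities=" + caps + "," + caps);
    }
}
```

Using a long is what allows capabilities numbered above 31 to fit; passing the same mask twice means everything system_server is permitted to do is also effective immediately.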



Zygote.forkSystemServer:

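/**
 * Special method to start the system server process. In addition to the common actions
 * performed in forkAndSpecialize, the pid of the child process is recorded so that the
 * death of the child process will cause zygote to exit.
 *
 * @param uid the UNIX uid that the new process should setuid() to after fork()ing and
 *            before spawning any threads.
 * @param gid the UNIX gid that the new process should setgid() to after fork()ing and
 *            before spawning any threads.
 * @param gids null-ok; a list of UNIX gids that the new process should setgroups() to
 *             after fork and before spawning any threads.
 * @param debugFlags bit flags that enable debugging features.
 * @param rlimits null-ok; an array of rlimit tuples, with the second dimension having a
 *                length of 3 and representing (resource, rlim_cur, rlim_max). These are
 *                set via the posix setrlimit(2) call.
 * @param permittedCapabilities argument for setcap()
 * @param effectiveCapabilities argument for setcap()
 *
 * @return 0 if this is the child, pid of the child if this is the parent, or -1 on error.
 */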
public static int forkSystemServer(int uid, int gid, int[] gids, int debugFlags,
        int[][] rlimits, long permittedCapabilities, long effectiveCapabilities) {
    
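    // 1. Notify the VM hooks (VM_HOOKS) that a fork is about to happen.
    // This is typically used for cleanup before the fork, such as stopping the runtime's
    // daemon threads, so that the VM is in a safe, consistent state to fork.
    VM_HOOKS.preFork();

    // 2. Reset the zygote process's nice priority so that forked children
    // do not inherit the boosted boot-time priority
    resetNicePriority();

    // 3. Call the native method that performs the actual system server fork.
    // This is the core step that creates the system server process.
    int pid = nativeForkSystemServer(
            uid, gid, gids, debugFlags, rlimits,
            permittedCapabilities, effectiveCapabilities);

    // 4. In the child process (the system server), enable tracing
    if (pid == 0) {
        Trace.setTracingEnabled(true, debugFlags);
    }

    // 5. Notify the VM hooks that the fork has completed.
    // This is typically used to restore state after the fork, such as restarting the daemons.
    // It runs in both the parent and the child.
    VM_HOOKS.postForkCommon();

    return pid; // the result of the fork
}

// 6. Declaration of the native method; the actual logic lives in the C/C++ layer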
native private static int nativeForkSystemServer(
        int uid, int gid, int[] gids, int debugFlags,
        int[][] rlimits, long permittedCapabilities, 
        long effectiveCapabilities);

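A grep shows that frameworks/base/core/jni/com_android_internal_os_Zygote.cpp contains a function named com_android_internal_os_Zygote_nativeForkSystemServer, which is the JNI counterpart and is what actually creates the process. It in turn calls ForkAndSpecializeCommon to spawn the process and then handles the result.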

static jint com_android_internal_os_Zygote_nativeForkSystemServer(
        JNIEnv* env, jclass, uid_t uid, gid_t gid, jintArray gids,
        jint debug_flags, jobjectArray rlimits, jlong permittedCapabilities,
        jlong effectiveCapabilities) {
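  // 1. Call the common fork-and-specialize routine to create the system server process.
  // Note the true argument (is_system_server), which marks this as the system server.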
  pid_t pid = ForkAndSpecializeCommon(env, uid, gid, gids,
                                      debug_flags, rlimits,
                                      permittedCapabilities, effectiveCapabilities,
                                      MOUNT_EXTERNAL_DEFAULT, NULL, NULL, true, NULL,
                                      NULL, NULL, NULL);
  if (pid > 0) {
      // The zygote process checks whether the child process has died or not.
      ALOGI("System server process %d has been created", pid);
      // Store the system server's pid in a global variable so that other code
      // (such as the SIGCHLD handling) can monitor the system server's state
      gSystemServerPid = pid;
      // There is a slight window that the system server process has crashed
      // but it went unnoticed because we haven't published its pid yet. So
      // we recheck here just to make sure that all is well.
      int status;
      // Key safety check for the race described above: waitpid with WNOHANG polls the
      // child's state without blocking, to confirm that it has not already died.
      if (waitpid(pid, &status, WNOHANG) == pid) {
          ALOGE("System server process %d has died. Restarting Zygote!", pid);
          RuntimeAbort(env, __LINE__, "System server process has died. Restarting Zygote!");
      }
      // Memory management: assign system_server to the correct memory cgroup,
      // which is used to limit and monitor its memory usage.
      if (!WriteStringToFile(StringPrintf("%d", pid), "/dev/memcg/system/tasks")) {
        ALOGE("couldn't write %d to /dev/memcg/system/tasks", pid);
      }
  }
  // Return the result of the fork:
  // - in the parent (Zygote): the child's pid (> 0)
  // - in the child (system server): 0
  // - on error: -1
  return pid;
}

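Now let's analyze ForkAndSpecializeCommon. Much of it is signal handling and error handling; the most important step is the call to fork, which spawns a new process (not a thread). fork returns in both processes: a return value of 0 means we are in the child, which must then be initialized and configured; a value greater than 0 means we are in the parent, which does little more here than unblock SIGCHLD and return, ready for the next fork request.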

// Utility routine to fork zygote and specialize the child process.
// is_system_server is true on this call path
static pid_t ForkAndSpecializeCommon(JNIEnv* env, uid_t uid, gid_t gid, jintArray javaGids,
                                     jint debug_flags, jobjectArray javaRlimits,
                                     jlong permittedCapabilities, jlong effectiveCapabilities,
                                     jint mount_external,
                                     jstring java_se_info, jstring java_se_name,
                                     bool is_system_server, jintArray fdsToClose,
                                     jintArray fdsToIgnore,
                                     jstring instructionSet, jstring dataDir) {
  SetSigChldHandler();

  sigset_t sigchld;
  sigemptyset(&sigchld);
  sigaddset(&sigchld, SIGCHLD);

  // Temporarily block SIGCHLD during forks. The SIGCHLD handler might
  // log, which would result in the logging FDs we close being reopened.
  // This would cause failures because the FDs are not whitelisted.
  //
  // Note that the zygote process is single threaded at this point.
  if (sigprocmask(SIG_BLOCK, &sigchld, nullptr) == -1) {
    ALOGE("sigprocmask(SIG_SETMASK, { SIGCHLD }) failed: %s", strerror(errno));
    RuntimeAbort(env, __LINE__, "Call to sigprocmask(SIG_BLOCK, { SIGCHLD }) failed.");
  }

  // Close any logging related FDs before we start evaluating the list of
  // file descriptors.
  __android_log_close();

  // If this is the first fork for this zygote, create the open FD table.
  // If it isn't, we just need to check whether the list of open files has
  // changed (and it shouldn't in the normal case).
  std::vector<int> fds_to_ignore;
  FillFileDescriptorVector(env, fdsToIgnore, &fds_to_ignore);
  if (gOpenFdTable == NULL) {
    gOpenFdTable = FileDescriptorTable::Create(fds_to_ignore);
    if (gOpenFdTable == NULL) {
      RuntimeAbort(env, __LINE__, "Unable to construct file descriptor table.");
    }
  } else if (!gOpenFdTable->Restat(fds_to_ignore)) {
    RuntimeAbort(env, __LINE__, "Unable to restat file descriptor table.");
  }
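  // The code above is not analyzed in detail; it mainly deals with signals and error handling.
  // Perform the fork system call to create the child process (the most important step).
  pid_t pid = fork();
  // Child process branch
  if (pid == 0) {
    // Preparation before application initialization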
    PreApplicationInit();

    // Clean up any descriptors which must be closed immediately
    DetachDescriptors(env, fdsToClose);

    // Re-open all remaining open file descriptors so that they aren't shared
    // with the zygote across a fork.
    if (!gOpenFdTable->ReopenOrDetach()) {
      RuntimeAbort(env, __LINE__, "Unable to reopen whitelisted descriptors.");
    }

    if (sigprocmask(SIG_UNBLOCK, &sigchld, nullptr) == -1) {
      ALOGE("sigprocmask(SIG_SETMASK, { SIGCHLD }) failed: %s", strerror(errno));
      RuntimeAbort(env, __LINE__, "Call to sigprocmask(SIG_UNBLOCK, { SIGCHLD }) failed.");
    }

    // Keep capabilities across UID change, unless we're staying root.
    if (uid != 0) {
      EnableKeepCapabilities(env);
    }

    SetInheritable(env, permittedCapabilities);
    DropCapabilitiesBoundingSet(env);

    bool use_native_bridge = !is_system_server && (instructionSet != NULL)
        && android::NativeBridgeAvailable();
    if (use_native_bridge) {
      ScopedUtfChars isa_string(env, instructionSet);
      use_native_bridge = android::NeedsNativeBridge(isa_string.c_str());
    }
    if (use_native_bridge && dataDir == NULL) {
      // dataDir should never be null if we need to use a native bridge.
      // In general, dataDir will never be null for normal applications. It can only happen in
      // special cases (for isolated processes which are not associated with any app). These are
      // launched by the framework and should not be emulated anyway.
      use_native_bridge = false;
      ALOGW("Native bridge will not be used because dataDir == NULL.");
    }

    if (!MountEmulatedStorage(uid, mount_external, use_native_bridge)) {
      ALOGW("Failed to mount emulated storage: %s", strerror(errno));
      if (errno == ENOTCONN || errno == EROFS) {
        // When device is actively encrypting, we get ENOTCONN here
        // since FUSE was mounted before the framework restarted.
        // When encrypted device is booting, we get EROFS since
        // FUSE hasn't been created yet by init.
        // In either case, continue without external storage.
      } else {
        RuntimeAbort(env, __LINE__, "Cannot continue without emulated storage");
      }
    }
    // If this is not the system server, create a process group
    if (!is_system_server) {
        int rc = createProcessGroup(uid, getpid());
        if (rc != 0) {
            if (rc == -EROFS) {
                ALOGW("createProcessGroup failed, kernel missing CONFIG_CGROUP_CPUACCT?");
            } else {
                ALOGE("createProcessGroup(%d, %d) failed: %s", uid, pid, strerror(-rc));
            }
        }
    }
    // Set the supplementary group IDs
    SetGids(env, javaGids);
    // Set the resource limits
    SetRLimits(env, javaRlimits);

    if (use_native_bridge) {
      ScopedUtfChars isa_string(env, instructionSet);
      ScopedUtfChars data_dir(env, dataDir);
      android::PreInitializeNativeBridge(data_dir.c_str(), isa_string.c_str());
    }
    // Set the real, effective, and saved group IDs
    int rc = setresgid(gid, gid, gid);
    if (rc == -1) {
      ALOGE("setresgid(%d) failed: %s", gid, strerror(errno));
      RuntimeAbort(env, __LINE__, "setresgid failed");
    }

    rc = setresuid(uid, uid, uid);
    if (rc == -1) {
      ALOGE("setresuid(%d) failed: %s", uid, strerror(errno));
      RuntimeAbort(env, __LINE__, "setresuid failed");
    }

    if (NeedsNoRandomizeWorkaround()) {
        // Work around ARM kernel ASLR lossage (http://b/5817320).
        int old_personality = personality(0xffffffff);
        int new_personality = personality(old_personality | ADDR_NO_RANDOMIZE);
        if (new_personality == -1) {
            ALOGW("personality(%d) failed: %s", new_personality, strerror(errno));
        }
    }
    // Set the process capabilities
    SetCapabilities(env, permittedCapabilities, effectiveCapabilities, permittedCapabilities);
    // Set the scheduler policy
    SetSchedulerPolicy(env);

    const char* se_info_c_str = NULL;
    ScopedUtfChars* se_info = NULL;
    if (java_se_info != NULL) {
        se_info = new ScopedUtfChars(env, java_se_info);
        se_info_c_str = se_info->c_str();
        if (se_info_c_str == NULL) {
          RuntimeAbort(env, __LINE__, "se_info_c_str == NULL");
        }
    }
    const char* se_name_c_str = NULL;
    ScopedUtfChars* se_name = NULL;
    if (java_se_name != NULL) {
        se_name = new ScopedUtfChars(env, java_se_name);
        se_name_c_str = se_name->c_str();
        if (se_name_c_str == NULL) {
          RuntimeAbort(env, __LINE__, "se_name_c_str == NULL");
        }
    }
    rc = selinux_android_setcontext(uid, is_system_server, se_info_c_str, se_name_c_str);
    if (rc == -1) {
      ALOGE("selinux_android_setcontext(%d, %d, \"%s\", \"%s\") failed", uid,
            is_system_server, se_info_c_str, se_name_c_str);
      RuntimeAbort(env, __LINE__, "selinux_android_setcontext failed");
    }

    // Make it easier to debug audit logs by setting the main thread's name to the
    // nice name rather than "app_process".
    if (se_info_c_str == NULL && is_system_server) {
      se_name_c_str = "system_server";
    }
    if (se_info_c_str != NULL) {
      SetThreadName(se_name_c_str);
    }

    delete se_info;
    delete se_name;

    UnsetSigChldHandler();
    // Call the Java-layer post-fork child hooks
    env->CallStaticVoidMethod(gZygoteClass, gCallPostForkChildHooks, debug_flags,
                              is_system_server, instructionSet);
    if (env->ExceptionCheck()) {
      RuntimeAbort(env, __LINE__, "Error calling post fork hooks.");
    }
  } else if (pid > 0) {
    // the parent process

    // We blocked SIGCHLD prior to a fork, we unblock it here.
    if (sigprocmask(SIG_UNBLOCK, &sigchld, nullptr) == -1) {
      ALOGE("sigprocmask(SIG_SETMASK, { SIGCHLD }) failed: %s", strerror(errno));
      RuntimeAbort(env, __LINE__, "Call to sigprocmask(SIG_UNBLOCK, { SIGCHLD }) failed.");
    }
  }
  return pid;
}
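Since the whole flow hinges on how the fork return value is interpreted, here is a small sketch of that contract as the Java layer sees it. Java cannot call fork directly, so the pid is passed in as a plain int; ForkResultSketch, Role, roleFor, and entryPointFor are all hypothetical names invented for this illustration.

```java
public class ForkResultSketch {
    enum Role { CHILD, PARENT, ERROR }

    /** Maps a fork-style return value to the process we are running in. */
    static Role roleFor(int pid) {
        if (pid == 0) {
            return Role.CHILD;   // the new process sees 0
        } else if (pid > 0) {
            return Role.PARENT;  // the parent sees the child's pid
        }
        return Role.ERROR;       // -1: no child was created
    }

    /**
     * Mirrors the contract of forkSystemServer: a non-null Runnable only in
     * the child, null in the parent.
     */
    static Runnable entryPointFor(int pid, Runnable childEntry) {
        return roleFor(pid) == Role.CHILD ? childEntry : null;
    }

    public static void main(String[] args) {
        Runnable childEntry = () -> System.out.println("child: run SystemServer entry point");

        // Simulated child: the Runnable is returned and executed.
        Runnable r = entryPointFor(0, childEntry);
        if (r != null) {
            r.run();
        }

        // Simulated parent (1234 is a made-up pid): nothing to run, so Zygote
        // would carry on to its select loop.
        System.out.println("parent gets null: " + (entryPointFor(1234, childEntry) == null));
    }
}
```

This is why main in ZygoteInit checks the Runnable for null: the same code runs in both processes after the fork, and only the return value tells them apart.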

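At this point a process has been created. The follow-up step, the handleSystemServerProcess method, which initializes the system server process and returns the Runnable, will be studied and explained in a later post.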

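Now back to the main function in ZygoteInit.java that we started with. One important method remains: runSelectLoop. This loop is the core mechanism by which Android starts new application processes: the Zygote process uses it to receive requests to create a new process and then forks a child. Looking at the method, it implements multiplexed handling of socket connections, using poll to watch several descriptors at once. When the server socket receives a new connection request, it accepts the connection by calling acceptCommandPeer, which creates a ZygoteConnection object, and adds it to the list. For an existing connection, it calls the processOneCommand method of ZygoteConnection to handle the client's command.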

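// Runs Zygote's select loop: multiplexes the server socket and the client connections,
// and returns a Runnable in a forked child
Runnable runSelectLoop(String abiList) {
    // All file descriptors to watch (the server socket plus the client connections)
    ArrayList<FileDescriptor> fds = new ArrayList<FileDescriptor>();
    // The ZygoteConnection object for each file descriptor, at the same index
    ArrayList<ZygoteConnection> peers = new ArrayList<ZygoteConnection>();

    // Add the server socket's file descriptor to the watch list
    fds.add(mServerSocket.getFileDescriptor());
    // The server socket has no ZygoteConnection, so add null to keep the indexes aligned
    peers.add(null);

    // Loop forever, watching for and handling connections
    while (true) {
        // Build the pollfd array for the poll system call
        StructPollfd[] pollFds = new StructPollfd[fds.size()];
        for (int i = 0; i < pollFds.length; ++i) {
            pollFds[i] = new StructPollfd();
            pollFds[i].fd = fds.get(i); // the file descriptor
            pollFds[i].events = (short) POLLIN; // watch for readable events
        }
        try {
            // Call poll with no timeout (-1): block until an event arrives
            Os.poll(pollFds, -1);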
        } catch (ErrnoException ex) {
            throw new RuntimeException("poll failed", ex);
        }
        
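        // Walk the descriptors from last to first (so removing an element does not shift indices still to be visited)
        for (int i = pollFds.length - 1; i >= 0; --i) {
            // Skip descriptors with no readable event
            if ((pollFds[i].revents & POLLIN) == 0) {
                continue;
            }

            // Index 0 is the server socket
            if (i == 0) {
                // Accept a new client connection
                ZygoteConnection newPeer = acceptCommandPeer(abiList);
                // Add the new connection to the lists
                peers.add(newPeer);
                fds.add(newPeer.getFileDesciptor());
            } else {
                // Handle an existing client connection
                try {
                    // Look up the matching Zygote connection
                    ZygoteConnection connection = peers.get(i);
                    // Process one command; this may return a Runnable (to be run in the child process)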
                    final Runnable command = connection.processOneCommand(this);

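                    // Check whether we are in the child process (after the fork)
                    if (mIsForkChild) {
                        // In the child there should always be a command to run
                        if (command == null) {
                            throw new IllegalStateException("command == null");
                        }

                        // Return the command to the caller to run
                        return command;
                    } else {
                        // In the server process there should never be a command to run
                        if (command != null) {
                            throw new IllegalStateException("command != null");
                        }

                        // Check whether the peer closed the connection
                        if (connection.isClosedByPeer()) {
                            // Close the socket and clean up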
                            connection.closeSocket();
                            peers.remove(i);
                            fds.remove(i);
                        }
                    }
                } catch (Exception e) {
                    if (!mIsForkChild) {
                        // Exception handling in the server process
                        // Log the error
                        Slog.e(TAG, "Exception executing zygote command: ", e);

                        // Drop the failing connection and clean up
                        ZygoteConnection conn = peers.remove(i);
                        conn.closeSocket();
                        fds.remove(i);
                    } else {
                        // Exception handling in the child (after the fork but before its main method runs)
                        Log.e(TAG, "Caught post-fork exception in child process.", e);
                        throw e; // Rethrow to terminate the child process
                    }
                }
            }
        }
    }
}
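
The loop above keeps fds and peers as two parallel lists and walks them from the last index down to 0, so that removing a closed connection never shifts an index that still has to be visited. Below is a minimal sketch of just that bookkeeping, not the AOSP code: the class name ParallelListSketch, the removeClosed helper, and the use of plain integers and strings in place of FileDescriptor and ZygoteConnection are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of the parallel fds/peers lists (hypothetical names, not AOSP code). */
final class ParallelListSketch {

    /**
     * Removes every entry whose index is flagged in closed[], walking from the
     * last index down to 1. Index 0 stands for the listening socket slot and is
     * never removed. Both lists are updated together so they stay aligned.
     */
    static void removeClosed(List<Integer> fds, List<String> peers, boolean[] closed) {
        for (int i = fds.size() - 1; i >= 1; --i) {
            if (closed[i]) {
                peers.remove(i); // remove by position
                fds.remove(i);   // int argument selects remove(int index), not remove(Object)
            }
        }
    }

    public static void main(String[] args) {
        // Slot 0 mirrors the server socket: a descriptor with a null peer.
        List<Integer> fds = new ArrayList<>(Arrays.asList(10, 11, 12, 13));
        List<String> peers = new ArrayList<>(Arrays.asList(null, "A", "B", "C"));
        boolean[] closed = {false, true, false, true};

        removeClosed(fds, peers, closed);
        System.out.println(fds + " " + peers); // prints "[10, 12] [null, B]"
    }
}
```

Walking forward instead would skip the element that slides into a just-removed position, which is why the backward direction is the simpler choice here.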



The acceptCommandPeer method creates a ZygoteConnection:
private ZygoteConnection acceptCommandPeer(String abiList) {
        try {
            return createNewConnection(mServerSocket.accept(), abiList);
        } catch (IOException ex) {
            throw new RuntimeException(
                    "IOException during accept()", ex);
        }
    }

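Now let's look at the processOneCommand method. It mainly reads and parses the arguments, then applies a series of security-policy and limit checks, and finally calls the now-familiar Zygote.forkAndSpecialize method.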

Runnable processOneCommand(ZygoteServer zygoteServer) {
        String args[];
        Arguments parsedArgs = null;
        FileDescriptor[] descriptors;

        try {
            // Read the arguments from the socket
            args = readArgumentList();
            // Get the ancillary file descriptors (used to pass file descriptors between processes)
            descriptors = mSocket.getAncillaryFileDescriptors();
        } catch (IOException ex) {
            throw new IllegalStateException("IOException on command socket", ex);
        }

        // readArgumentList returns null only when it has reached EOF with no available
        // data to read. This will only happen when the remote socket has disconnected.
        if (args == null) {
            isEof = true;
            return null;
        }

        int pid = -1;
        FileDescriptor childPipeFd = null;
        FileDescriptor serverPipeFd = null;
        // Parse the arguments
        parsedArgs = new Arguments(args);

        if (parsedArgs.abiListQuery) {
            handleAbiListQuery();
            return null;
        }

        if (parsedArgs.preloadDefault) {
            handlePreload();
            return null;
        }

        if (parsedArgs.preloadPackage != null) {
            handlePreloadPackage(parsedArgs.preloadPackage, parsedArgs.preloadPackageLibs,
                    parsedArgs.preloadPackageCacheKey);
            return null;
        }

        if (parsedArgs.permittedCapabilities != 0 || parsedArgs.effectiveCapabilities != 0) {
            throw new ZygoteSecurityException("Client may not specify capabilities: " +
                    "permitted=0x" + Long.toHexString(parsedArgs.permittedCapabilities) +
                    ", effective=0x" + Long.toHexString(parsedArgs.effectiveCapabilities));
        }

        applyUidSecurityPolicy(parsedArgs, peer);
        applyInvokeWithSecurityPolicy(parsedArgs, peer);

        applyDebuggerSystemProperty(parsedArgs);
        applyInvokeWithSystemProperty(parsedArgs);

        int[][] rlimits = null;

        if (parsedArgs.rlimits != null) {
            rlimits = parsedArgs.rlimits.toArray(intArray2d);
        }

        int[] fdsToIgnore = null;

        if (parsedArgs.invokeWith != null) {
            try {
                FileDescriptor[] pipeFds = Os.pipe2(O_CLOEXEC);
                childPipeFd = pipeFds[1];
                serverPipeFd = pipeFds[0];
                Os.fcntlInt(childPipeFd, F_SETFD, 0);
                fdsToIgnore = new int[]{childPipeFd.getInt$(), serverPipeFd.getInt$()};
            } catch (ErrnoException errnoEx) {
                throw new IllegalStateException("Unable to set up pipe for invoke-with", errnoEx);
            }
        }

        /**
         * In order to avoid leaking descriptors to the Zygote child,
         * the native code must close the two Zygote socket descriptors
         * in the child process before it switches from Zygote-root to
         * the UID and privileges of the application being launched.
         *
         * In order to avoid "bad file descriptor" errors when the
         * two LocalSocket objects are closed, the Posix file
         * descriptors are released via a dup2() call which closes
         * the socket and substitutes an open descriptor to /dev/null.
         */

        int [] fdsToClose = { -1, -1 };

        FileDescriptor fd = mSocket.getFileDescriptor();

        if (fd != null) {
            fdsToClose[0] = fd.getInt$();
        }

        fd = zygoteServer.getServerSocketFileDescriptor();

        if (fd != null) {
            fdsToClose[1] = fd.getInt$();
        }

        fd = null;
        // Fork and specialize the new process (the core step)
        pid = Zygote.forkAndSpecialize(parsedArgs.uid, parsedArgs.gid, parsedArgs.gids,
                parsedArgs.debugFlags, rlimits, parsedArgs.mountExternal, parsedArgs.seInfo,
                parsedArgs.niceName, fdsToClose, fdsToIgnore, parsedArgs.instructionSet,
                parsedArgs.appDataDir);

        try {
            if (pid == 0) {
                // in child
                // Mark this process as a forked child
                zygoteServer.setForkChild();

                zygoteServer.closeServerSocket();
                IoUtils.closeQuietly(serverPipeFd);
                serverPipeFd = null;
                // Run the child-side logic and return the Runnable to execute
                return handleChildProc(parsedArgs, descriptors, childPipeFd);
            } else {
                // In the parent. A pid < 0 indicates a failure and will be handled in
                // handleParentProc.
                
                IoUtils.closeQuietly(childPipeFd);
                childPipeFd = null;
                handleParentProc(pid, descriptors, serverPipeFd);
                return null;
            }
        } finally {
            IoUtils.closeQuietly(childPipeFd);
            IoUtils.closeQuietly(serverPipeFd);
        }
    }
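
The listing calls readArgumentList without showing it. As a rough illustration of what such a reader does, the sketch below assumes a simple line-based format: one line holding the argument count, followed by that many lines with one argument each, and a null return when the peer has disconnected before sending anything (matching the comment in the listing above). The class name ArgListSketch, the MAX_ARGS limit, and the sample argument strings are hypothetical and only serve the example; this is not the AOSP implementation.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;

/** Sketch of a count-prefixed, newline-separated argument reader (hypothetical, not AOSP code). */
final class ArgListSketch {
    /** Hypothetical upper bound so a malformed count cannot trigger a huge allocation. */
    static final int MAX_ARGS = 1024;

    /** Returns the arguments, or null if the stream hit EOF before any data arrived. */
    static String[] readArgumentList(BufferedReader in) {
        try {
            String countLine = in.readLine();
            if (countLine == null) {
                return null; // EOF: the remote side has disconnected
            }
            int argc = Integer.parseInt(countLine.trim());
            if (argc < 0 || argc > MAX_ARGS) {
                throw new IllegalArgumentException("bad argument count: " + argc);
            }
            String[] result = new String[argc];
            for (int i = 0; i < argc; i++) {
                result[i] = in.readLine();
                if (result[i] == null) {
                    throw new IllegalArgumentException("truncated request");
                }
            }
            return result;
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    public static void main(String[] args) {
        // Illustrative request only; the argument strings are made up for this example.
        String request = "2\n--nice-name=demo\ncom.example.Main\n";
        String[] parsed = readArgumentList(new BufferedReader(new StringReader(request)));
        System.out.println(Arrays.toString(parsed)); // prints "[--nice-name=demo, com.example.Main]"
    }
}
```

Returning null for a clean EOF, rather than throwing, is what lets the caller tell "peer went away" apart from a malformed request.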

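Now look at the forkAndSpecialize method in the Zygote class. Here we find the familiar nativeForkAndSpecialize method, which again calls down to fork at the C++ level. So the overall picture is this: once the Zygote process is up, it enters an infinite loop that multiplexes several socket connections. When a client sends a request to create a process, Zygote parses the arguments and calls the native-level fork to spawn a child, which is then marked as a fork child. A return value of 0 means the code is running in the child, where tracing is enabled and the child-side setup continues; a return value greater than 0 is the child's pid as seen by the parent.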

 public static int forkAndSpecialize(int uid, int gid, int[] gids, int debugFlags,
          int[][] rlimits, int mountExternal, String seInfo, String niceName, int[] fdsToClose,
          int[] fdsToIgnore, String instructionSet, String appDataDir) {
        VM_HOOKS.preFork();
        // Resets nice priority for zygote process.
        resetNicePriority();
        int pid = nativeForkAndSpecialize(
                  uid, gid, gids, debugFlags, rlimits, mountExternal, seInfo, niceName, fdsToClose,
                  fdsToIgnore, instructionSet, appDataDir);
        // Enable tracing as soon as possible for the child process.
        if (pid == 0) {
            Trace.setTracingEnabled(true, debugFlags);

            // Note that this event ends at the end of handleChildProc,
            Trace.traceBegin(Trace.TRACE_TAG_ACTIVITY_MANAGER, "PostFork");
        }
        VM_HOOKS.postForkCommon();
        return pid;
    }

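That covers the full flow and source code for how the Zygote service is created and forks a process at startup, and how it listens on a socket to fork further processes. In both cases, after the native-level fork, the Java layer ends up returning a Runnable object. What that object does will be analyzed in a later post.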
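
To make the "return a Runnable" control flow concrete, here is a small single-process sketch. Java cannot fork, so the pid values are simulated, and every name in it (ReturnRunnableSketch, selectLoop, the log list) is a hypothetical stand-in rather than AOSP code. It only shows the shape of the pattern: the parent branch returns null and the loop keeps going, while the child branch returns a Runnable that travels back up to the caller, which then runs it.

```java
import java.util.ArrayList;
import java.util.List;

/** Single-process toy of the "child returns a Runnable, parent keeps looping" pattern. */
final class ReturnRunnableSketch {
    static final List<String> log = new ArrayList<>();

    /** simulatedPid stands in for the fork result: 0 = child, > 0 = parent, < 0 = failure. */
    static Runnable processOneCommand(int simulatedPid) {
        if (simulatedPid == 0) {
            // Child side: hand back the work instead of running it deep in the call stack.
            return () -> log.add("child: run app entry point");
        }
        // Parent side: report the outcome to the client and return nothing to run.
        log.add(simulatedPid > 0
                ? "parent: report pid " + simulatedPid
                : "parent: report fork failure");
        return null;
    }

    /** Handles commands in order and returns as soon as one of them yields a Runnable. */
    static Runnable selectLoop(int[] simulatedPids) {
        for (int pid : simulatedPids) {
            Runnable command = processOneCommand(pid);
            if (command != null) {
                return command;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Runnable caller = selectLoop(new int[] {1234, -1, 0, 5678});
        if (caller != null) {
            caller.run(); // mirrors main() running the returned Runnable
        }
        // prints "[parent: report pid 1234, parent: report fork failure, child: run app entry point]"
        System.out.println(log);
    }
}
```

In a real fork the parent and child are separate processes, so each one sees only its own branch; the toy merely replays both branches in sequence to show where the Runnable ends up.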
