The following is the basic model of Apollo.
The following figure provides an overview of Apollo's architecture modules. For a detailed description, you can refer to Apollo Configuration Center Architecture Anatomy.
The diagram above briefly describes the overall design of Apollo, which we can walk through from bottom to top:
Config Service provides configuration reading, pushing, and related functions; its service object is the Apollo client.
sequenceDiagram
Client ->> Config Service: request
Config Service ->> ConfigDB: request
ConfigDB -->> Config Service: ack
Config Service -->> Client: ack
Admin Service provides configuration modification, publishing, and related functions; its service object is the Apollo Portal (the management interface).
sequenceDiagram
Portal ->> Admin Service: r/w, publish appId/cluster/namespace
Admin Service ->> ConfigDB: r/w, publish appId/cluster/namespace
ConfigDB -->> Admin Service: ack
Admin Service -->> Portal: ack
Config Service and Admin Service are both deployed as multiple stateless instances, so they need to register themselves with Eureka and maintain a heartbeat.
On top of Eureka we added a Meta Server layer to encapsulate Eureka's service discovery interface.
sequenceDiagram
Client or Portal ->> Meta Server: discovery service's instances
Meta Server ->> Eureka: discovery service's instances
Eureka -->> Meta Server: service's instances
Meta Server -->> Client or Portal: service's instances
The Client accesses the Meta Server through its domain name to obtain the Config Service instance list (IP+Port), and then accesses the service directly via IP+Port, doing load balancing and error retries on the Client side (a sketch of this logic follows the diagram below).
sequenceDiagram
Client ->> Meta Server: discovery Config Service's instances
Meta Server -->> Client: Config Service's instances(Multiple IP+Port)
loop until success
Client ->> Client: load balance choose a Config Service instance
Client ->> Config Service: request
Config Service -->> Client: ack
end
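To make this concrete, here is a minimal, self-contained sketch of the client-side logic: ask the Meta Server for the Config Service instance list, pick an instance at random, and retry another instance on failure. The Meta Server path (/services/config), the response handling, and the helper names are illustrative assumptions, not Apollo's actual client code.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

public class ConfigServiceDiscoveryExample {

    private static final HttpClient HTTP = HttpClient.newHttpClient();

    // Assumption: the Meta Server exposes the Config Service instance list over HTTP;
    // the exact path and response format are simplified here.
    static List<String> discoverConfigServices(String metaServerDomain) throws Exception {
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(metaServerDomain + "/services/config"))
                .GET()
                .build();
        String body = HTTP.send(request, HttpResponse.BodyHandlers.ofString()).body();
        // Real code would parse the JSON body; here we pretend it is a comma-separated
        // list of "http://ip:port" entries to keep the sketch dependency-free.
        return List.of(body.split(","));
    }

    // Client-side load balancing (random choice) plus simple error retry,
    // mirroring the "loop until success" in the sequence diagram above.
    static String fetchWithRetry(List<String> configServices, String path, int maxAttempts) throws Exception {
        Exception last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            String base = configServices.get(ThreadLocalRandom.current().nextInt(configServices.size()));
            try {
                HttpRequest request = HttpRequest.newBuilder().uri(URI.create(base + path)).GET().build();
                HttpResponse<String> response = HTTP.send(request, HttpResponse.BodyHandlers.ofString());
                if (response.statusCode() == 200) {
                    return response.body();
                }
            } catch (Exception e) {
                last = e; // this instance failed, try another one
            }
        }
        throw new IllegalStateException("all Config Service instances failed", last);
    }
}
```

The Portal side follows the same pattern, only against the Admin Service instance list.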
The Portal accesses the Meta Server through its domain name to obtain the Admin Service instance list (IP+Port), and then accesses the service directly via IP+Port, doing load balancing and error retries on the Portal side.
sequenceDiagram
Portal ->> Meta Server: discovery Admin Service's instances
Meta Server -->> Portal: Admin Service's instances(Multiple IP+Port)
loop until success
Portal ->> Portal: load balance choose an Admin Service instance
Portal ->> Admin Service: request
Admin Service -->> Portal: ack
end
To simplify deployment, we actually deploy the three logical roles Config Service, Eureka, and Meta Server in the same JVM process.
graph
subgraph JVM Process
1[Config Service]
2[Eureka]
3[Meta Server]
end
The actual deployment architecture can be found in deployment-architecture
Why do we use Eureka as the service registry instead of the more traditional ZooKeeper or etcd? The reasons are roughly summarized as follows.
Provides a configuration acquisition interface (a usage sketch follows the diagram below)
sequenceDiagram
Client ->> Config Service: get content of appId/cluster/namespace
opt if namespace is not cached
Config Service ->> ConfigDB: get content of appId/cluster/namespace
ConfigDB -->> Config Service: content of appId/cluster/namespace
end
Config Service -->> Client: content of appId/cluster/namespace
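As a usage sketch of the acquisition interface, the example below issues a plain HTTP GET for one appId/cluster/namespace. The /configs/{appId}/{cluster}/{namespace} path follows the interface Apollo documents for other-language clients; the surrounding code is an illustration, not the official client.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class GetConfigExample {
    public static void main(String[] args) throws Exception {
        // Assumption: a Config Service instance is reachable at this address.
        String configService = "http://localhost:8080";
        String appId = "someApp";
        String cluster = "default";
        String namespace = "application";

        // The Config Service returns the namespace content (plus a releaseKey) as JSON.
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(String.format("%s/configs/%s/%s/%s", configService, appId, cluster, namespace)))
                .GET()
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}
```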
Provides a configuration update push interface (based on HTTP long polling); a simplified sketch follows below
The service object of these interfaces is the Apollo client
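To give a feel for how an HTTP long polling push interface can be served, here is a heavily simplified Spring sketch based on DeferredResult: the request is held open until a release notification arrives or the poll times out with HTTP 304. The controller name, path, and in-memory watch list are assumptions for illustration; the real NotificationControllerV2 is considerably more involved.

```java
import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.context.request.async.DeferredResult;

import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

// A toy long-polling endpoint: clients block on GET /notifications until the
// watched namespace is released or the request times out (HTTP 304).
@RestController
public class LongPollController {

    private final Map<String, Queue<DeferredResult<ResponseEntity<String>>>> watchers =
            new ConcurrentHashMap<>();

    @GetMapping("/notifications")
    public DeferredResult<ResponseEntity<String>> poll(@RequestParam String appId,
                                                       @RequestParam String namespace) {
        // Hold the request open for up to 60 seconds.
        DeferredResult<ResponseEntity<String>> result = new DeferredResult<>(60_000L);
        String key = appId + "+" + namespace;
        watchers.computeIfAbsent(key, k -> new ConcurrentLinkedQueue<>()).add(result);

        result.onTimeout(() ->
                result.setResult(ResponseEntity.status(HttpStatus.NOT_MODIFIED).build()));
        result.onCompletion(() -> watchers.get(key).remove(result));
        return result;
    }

    // Called when this Config Service learns that appId+namespace has been released
    // (how it learns about releases is described below).
    public void onRelease(String appId, String namespace) {
        String key = appId + "+" + namespace;
        Queue<DeferredResult<ResponseEntity<String>>> queue =
                watchers.getOrDefault(key, new ConcurrentLinkedQueue<>());
        DeferredResult<ResponseEntity<String>> watcher;
        while ((watcher = queue.poll()) != null) {
            watcher.setResult(ResponseEntity.ok("{\"namespaceName\":\"" + namespace + "\"}"));
        }
    }
}
```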
An important feature of a configuration center is pushing configuration changes to clients in real time after they are published. Let's take a brief look at how this is designed and implemented.
The above diagram briefly describes the general process of a configuration release.
After a configuration is published, the Admin Service needs to notify all Config Service instances that there has been a release, so that each Config Service can notify the corresponding clients to pull the latest configuration.
Conceptually, this is a typical messaging scenario where Admin Service acts as a producer to send out messages and each Config Service acts as a consumer to consume the messages. The decoupling of Admin Service and Config Service can be well achieved by a Message Queue component.
In terms of implementation, considering Apollo's actual usage scenarios and in order to minimize external dependencies, we did not use external messaging middleware, but instead implemented a simple message queue on top of the database (a rough sketch follows below).
The implementation is as follows.
The schematic diagram is as follows.
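As a rough sketch of such a database-backed queue (table name, schema, and method names are illustrative assumptions based on the description above): the Admin Service appends a row describing the release, and every Config Service instance periodically scans the table for rows newer than the last id it has seen.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.util.function.BiConsumer;

// Sketch of a database-backed message queue: a "release message" table written by
// Admin Service and scanned periodically by every Config Service instance.
public class ReleaseMessageQueueSketch {

    // Producer side (Admin Service): record that appId+cluster+namespace was released.
    public static void sendMessage(Connection db, String appId, String cluster, String namespace) throws Exception {
        try (PreparedStatement insert = db.prepareStatement(
                "INSERT INTO ReleaseMessage (message) VALUES (?)")) {
            insert.setString(1, appId + "+" + cluster + "+" + namespace);
            insert.executeUpdate();
        }
    }

    // Consumer side (Config Service): scan for messages newer than the last seen id,
    // e.g. from a scheduled task. Returns the new high-water mark.
    public static long scanMessages(Connection db, long lastScannedId,
                                    BiConsumer<String, String> onRelease) throws Exception {
        try (PreparedStatement select = db.prepareStatement(
                "SELECT id, message FROM ReleaseMessage WHERE id > ? ORDER BY id ASC")) {
            select.setLong(1, lastScannedId);
            try (ResultSet rs = select.executeQuery()) {
                while (rs.next()) {
                    lastScannedId = rs.getLong("id");
                    String[] parts = rs.getString("message").split("\\+");
                    // Notify watchers of this appId/namespace, e.g. by completing the
                    // long-poll requests held by the push interface sketched earlier.
                    onRelease.accept(parts[0], parts[2]);
                }
            }
        }
        return lastScannedId;
    }
}
```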
The previous section briefly described how NotificationControllerV2 learns that a configuration has been released, but how does NotificationControllerV2 notify the client when it learns that a configuration has been released?
The implementation is as follows.
The client initiates an HTTP long polling request to the notifications/v2 interface of the Config Service, i.e. NotificationControllerV2; see RemoteConfigLongPollService.

The above diagram briefly describes the principle of the Apollo client implementation.
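For a feel of the client side, here is a minimal long polling loop against the notifications/v2 interface. The query parameters and the 304/200 handling follow Apollo's documented behaviour for other-language clients, but the code itself is an illustrative sketch, not RemoteConfigLongPollService.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.time.Duration;

public class LongPollClientSketch {
    public static void main(String[] args) throws Exception {
        String configService = "http://localhost:8080"; // assumed Config Service address
        String appId = "someApp";
        String cluster = "default";
        long notificationId = -1; // -1 means "I have seen no notification yet"

        HttpClient client = HttpClient.newHttpClient();
        while (true) {
            // Watch the "application" namespace; the server holds the request open
            // until the namespace is released or the poll times out.
            String notifications = URLEncoder.encode(
                    "[{\"namespaceName\":\"application\",\"notificationId\":" + notificationId + "}]",
                    StandardCharsets.UTF_8);
            URI uri = URI.create(String.format("%s/notifications/v2?appId=%s&cluster=%s&notifications=%s",
                    configService, appId, cluster, notifications));
            HttpRequest request = HttpRequest.newBuilder(uri)
                    .timeout(Duration.ofSeconds(90)) // longer than the server-side hold time
                    .GET()
                    .build();

            HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
            if (response.statusCode() == 304) {
                continue; // nothing changed, poll again
            }
            if (response.statusCode() == 200) {
                System.out.println("configuration changed: " + response.body());
                // A real client would parse the new notificationId from the body and
                // then pull the latest configuration via the /configs interface.
            }
        }
    }
}
```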
The client can override the default refresh interval by specifying apollo.refreshInterval at runtime, in minutes.

Apollo not only supports getting the configuration through its API, but also supports integration with Spring/Spring Boot; the integration principle is briefly described as follows.
Spring has provided ConfigurableEnvironment and PropertySource since version 3.1.
The structure at runtime looks like this.
Note that there is an order of precedence among PropertySources: if a key is present in more than one property source, the property source that comes earlier takes precedence.
So, for the example above, a key defined in multiple property sources resolves to the value from the property source listed first.
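A tiny, hypothetical example of this precedence rule: two MapPropertySources define the same key, and the one registered first wins.

```java
import org.springframework.core.env.MapPropertySource;
import org.springframework.core.env.StandardEnvironment;

import java.util.Map;

public class PropertySourcePrecedenceExample {
    public static void main(String[] args) {
        StandardEnvironment environment = new StandardEnvironment();

        // "timeout" is defined in both property sources with different values.
        environment.getPropertySources().addFirst(
                new MapPropertySource("first", Map.<String, Object>of("timeout", "100", "batch", "200")));
        environment.getPropertySources().addLast(
                new MapPropertySource("last", Map.<String, Object>of("timeout", "999")));

        // The property source in front takes precedence, so "timeout" resolves to 100.
        System.out.println(environment.getProperty("timeout")); // 100
        System.out.println(environment.getProperty("batch"));   // 200
    }
}
```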
With the above principle understood, Apollo's approach to integrating with Spring/Spring Boot becomes clear: during the application startup phase, Apollo fetches the configuration from the remote end, assembles it into a PropertySource, and inserts it at the first position, as shown in the following diagram.
The related code can be found in PropertySourcesProcessor
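The sketch below mimics that idea with plain Spring APIs: fetch the remote configuration (stubbed with a fixed map here), wrap it in a MapPropertySource, and register it with addFirst so that it outranks the other property sources. Class and source names are illustrative; the real logic lives in PropertySourcesProcessor.

```java
import org.springframework.beans.BeansException;
import org.springframework.beans.factory.config.BeanFactoryPostProcessor;
import org.springframework.beans.factory.config.ConfigurableListableBeanFactory;
import org.springframework.context.EnvironmentAware;
import org.springframework.core.env.ConfigurableEnvironment;
import org.springframework.core.env.Environment;
import org.springframework.core.env.MapPropertySource;

import java.util.Map;

// Minimal sketch: during startup, fetch remote configuration and insert it as the
// first PropertySource so it wins over the application's local property sources.
public class RemoteConfigPropertySourceRegistrar implements BeanFactoryPostProcessor, EnvironmentAware {

    private ConfigurableEnvironment environment;

    @Override
    public void setEnvironment(Environment environment) {
        this.environment = (ConfigurableEnvironment) environment;
    }

    @Override
    public void postProcessBeanFactory(ConfigurableListableBeanFactory beanFactory) throws BeansException {
        // In Apollo this would be the configuration pulled from the Config Service;
        // here it is stubbed with a fixed map.
        Map<String, Object> remoteConfig = Map.of("timeout", "100", "batch", "200");

        environment.getPropertySources()
                .addFirst(new MapPropertySource("RemoteConfigPropertySource", remoteConfig));
    }
}
```

Registering the source with addFirst is exactly what gives the remote configuration higher priority than the locally defined properties.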
The following table summarizes the impact of various failure scenarios and how Apollo degrades in each of them:

Scenario | Impact | Degradation | Reason |
---|---|---|---|
A Config Service goes offline | No impact | | Config Service is stateless; the client reconnects to another Config Service |
All Config Services go offline | Clients cannot read the latest configuration; Portal is not affected | When a client restarts, it can read the locally cached configuration file. A newly scaled-out machine can obtain the cached configuration file from other machines; for details see Java Client Usage Guide - 1.2.3 Local Cache Path | |
An Admin Service goes offline | No impact | | Admin Service is stateless; Portal reconnects to another Admin Service |
All Admin Services go offline | Clients are not affected; Portal cannot update configuration | | |
A Portal goes offline | No impact | | The Portal domain name is bound to multiple servers through SLB and points to an available server after a retry |
All Portals go offline | Clients are not affected; Portal cannot update configuration | | |
A data center goes offline | No impact | | Multiple data centers are deployed with fully synchronized data; the Meta Server/Portal domain names automatically switch to other surviving data centers through SLB |
Database down | Clients are not affected; Portal cannot update configuration | With configuration caching enabled on Config Service, configuration reads are not affected by database downtime | |
The Apollo client and server currently support automatic CAT instrumentation, so if your company has deployed CAT internally, Apollo will automatically enable CAT tracing as long as cat-client is introduced.
If you don't use CAT, don't worry: as long as cat-client is not introduced, Apollo will not enable CAT tracing.
Apollo also provides a Tracer-related SPI, which makes it easy to integrate with your company's own monitoring system.
For more information, please refer to v0.4.0 Release Note
You can refer to the apollo-skywalking-pro sample contributed by @hepyu.
Since version 1.5.0, the Apollo server supports exposing metrics in Prometheus format through /prometheus, such as http://${someIp:somePort}/prometheus