Istio Service Mesh in Practice
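Introduction
Istio is one of the most widely used service mesh implementations. By injecting a sidecar proxy (Envoy) into each application Pod, it delivers three core capabilities without changes to application code: traffic management, security, and observability. A service mesh decouples microservice governance from business code, so developers can focus on business logic while the infrastructure team manages service-to-service communication in one place. This article walks through Istio's core features hands-on.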
Istio architecture
Core components
```text
┌──────────────────────────────────────────────────────────────────┐
│                   Istio Architecture Overview                    │
│                                                                  │
│ ┌──────────────────────────────────────────────────────────────┐ │
│ │                    Control Plane (istiod)                    │ │
│ │ ┌────────────────┐ ┌────────────────┐ ┌────────────────────┐ │ │
│ │ │     Pilot      │ │    Citadel     │ │       Galley       │ │ │
│ │ │  Traffic mgmt  │ │   Cert mgmt    │ │ Config validation  │ │ │
│ │ └────────────────┘ └────────────────┘ └────────────────────┘ │ │
│ └──────────────────────────────────────────────────────────────┘ │
│                            │ xDS API                             │
│          ┌─────────────────┼─────────────────┐                   │
│          ▼                 ▼                 ▼                   │
│   ┌─────────────┐   ┌─────────────┐   ┌─────────────┐            │
│   │    Pod A    │   │    Pod B    │   │    Pod C    │ Data Plane │
│   │ ┌─────────┐ │   │ ┌─────────┐ │   │ ┌─────────┐ │            │
│   │ │   App   │ │   │ │   App   │ │   │ │   App   │ │            │
│   │ ├─────────┤ │   │ ├─────────┤ │   │ ├─────────┤ │            │
│   │ │  Envoy  │◄┼───┼►│  Envoy  │◄┼───┼►│  Envoy  │ │            │
│   │ │ Sidecar │ │   │ │ Sidecar │ │   │ │ Sidecar │ │            │
│   │ └─────────┘ │   │ └─────────┘ │   │ └─────────┘ │            │
│   └─────────────┘   └─────────────┘   └─────────────┘            │
└──────────────────────────────────────────────────────────────────┘
```

Installing Istio
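```bash
# 1. Download Istio (pin the version so the directory name below matches)
curl -L https://istio.io/downloadIstio | ISTIO_VERSION=1.21.0 sh -
cd istio-1.21.0
export PATH=$PWD/bin:$PATH

# 2. Run the pre-installation checks
istioctl x precheck

# 3. Install Istio with the demo profile (istiod plus ingress and egress gateways)
istioctl install --set profile=demo -y

# 4. Verify the installation
kubectl get pods -n istio-system
# NAME                                    READY   STATUS
# istio-egressgateway-7c8f5bcbdb-abcde    1/1     Running
# istio-ingressgateway-7d9f8bcbdb-fghij   1/1     Running
# istiod-85c8b6f4d5-klmno                 1/1     Running

# 5. Check the installation status
istioctl verify-install
```

Custom installation configuration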
```yaml
# istio-install.yaml
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
  namespace: istio-system
  name: istio-controlplane
spec:
  # Istio ships no built-in "production" profile; "default" is the
  # recommended starting point for production deployments.
  profile: default
  meshConfig:
    accessLogFile: /dev/stdout
    accessLogEncoding: JSON
    defaultConfig:
      tracing:
        zipkin:
          address: zipkin.istio-system:9411
        sampling: 10.0
      holdApplicationUntilProxyStarts: true
    outboundTrafficPolicy:
      mode: REGISTRY_ONLY  # only allow traffic to services in the registry
  components:
    pilot:
      enabled: true
      k8s:
        resources:
          requests:
            cpu: 500m
            memory: 512Mi
          limits:
            cpu: 1000m
            memory: 1Gi
        hpaSpec:
          minReplicas: 2
          maxReplicas: 5
    ingressGateways:
      - name: istio-ingressgateway
        enabled: true
        k8s:
          service:
            type: LoadBalancer
            ports:
              - port: 80
                targetPort: 8080
                name: http
              - port: 443
                targetPort: 8443
                name: https
          resources:
            requests:
              cpu: 200m
              memory: 256Mi
            limits:
              cpu: 500m
              memory: 512Mi
          hpaSpec:
            minReplicas: 2
            maxReplicas: 10
    egressGateways:
      - name: istio-egressgateway
        enabled: true
```
```bash
# Install using the custom configuration
istioctl install -f istio-install.yaml -y
```

Sidecar injection
Automatic injection
```bash
# Enable automatic injection for the namespace
kubectl label namespace default istio-injection=enabled

# Verify the label
kubectl get namespace -L istio-injection
# NAME           STATUS   AGE   ISTIO-INJECTION
# default        Active   30d   enabled
# kube-system    Active   30d
# istio-system   Active   30d
```

Deploying the sample application
```yaml
# deploy/bookinfo/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: productpage
  namespace: default
  labels:
    app: productpage
    version: v1
spec:
  replicas: 2
  selector:
    matchLabels:
      app: productpage
      version: v1
  template:
    metadata:
      labels:
        app: productpage
        version: v1
      annotations:
        # Controls sidecar injection for this Pod
        sidecar.istio.io/inject: "true"
        # Resource requests for the sidecar proxy
        sidecar.istio.io/proxyCPU: "100m"
        sidecar.istio.io/proxyMemory: "128Mi"
    spec:
      containers:
        - name: productpage
          image: istio/examples-bookinfo-productpage-v1:1.18.0
          ports:
            - containerPort: 9080
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: 200m
              memory: 256Mi
          readinessProbe:
            httpGet:
              path: /health
              port: 9080
            initialDelaySeconds: 5
            periodSeconds: 10
---
apiVersion: v1
kind: Service
metadata:
  name: productpage
  namespace: default
spec:
  type: ClusterIP
  ports:
    - port: 9080
      targetPort: 9080
      name: http
  selector:
    app: productpage
```

Traffic management
VirtualService
```yaml
# traffic-management/virtual-service.yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: productpage
  namespace: default
spec:
  hosts:
    - productpage
  gateways:
    - mesh  # traffic inside the mesh
  http:
    # Premium users always go to v2
    - match:
        - headers:
            x-user-type:
              exact: premium
      route:
        - destination:
            host: productpage
            subset: v2
          weight: 100
      timeout: 10s
    # Everyone else: 90% to v1, 10% to v2
    - route:
        - destination:
            host: productpage
            subset: v1
          weight: 90
        - destination:
            host: productpage
            subset: v2
          weight: 10
      retries:
        attempts: 3
        perTryTimeout: 2s
        retryOn: 5xx,reset,connect-failure
```

DestinationRule
```yaml
# traffic-management/destination-rule.yaml
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: productpage
  namespace: default
spec:
  host: productpage
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 100
        connectTimeout: 5s
      http:
        h2UpgradePolicy: DEFAULT
        http1MaxPendingRequests: 100
        http2MaxRequests: 100
        maxRequestsPerConnection: 2
    outlierDetection:
      consecutive5xxErrors: 3
      interval: 30s
      baseEjectionTime: 30s
      maxEjectionPercent: 50
      minHealthPercent: 25
  subsets:
    - name: v1
      labels:
        version: v1
      # Subset-level policy overrides the service-level setting above
      trafficPolicy:
        connectionPool:
          http:
            http1MaxPendingRequests: 50
    - name: v2
      labels:
        version: v2
```

Gateway configuration
```yaml
# traffic-management/gateway.yaml
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: bookinfo-gateway
  namespace: default
spec:
  selector:
    istio: ingressgateway
  servers:
    # Plain HTTP is redirected to HTTPS
    - port:
        number: 80
        name: http
        protocol: HTTP
      hosts:
        - "bookinfo.example.com"
      tls:
        httpsRedirect: true
    - port:
        number: 443
        name: https
        protocol: HTTPS
      hosts:
        - "bookinfo.example.com"
      tls:
        mode: SIMPLE
        credentialName: bookinfo-tls-secret
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: bookinfo-ingress
  namespace: default
spec:
  hosts:
    - "bookinfo.example.com"
  gateways:
    - bookinfo-gateway
  http:
    - match:
        - uri:
            prefix: /productpage
        - uri:
            prefix: /static
        - uri:
            exact: /login
        - uri:
            exact: /logout
      route:
        - destination:
            host: productpage
            port:
              number: 9080
```

Canary releases
Weight-based canary release
```yaml
# canary/weight-based.yaml
# Stage 1: send 10% of traffic to v2
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: reviews-canary
  namespace: default
spec:
  hosts:
    - reviews
  http:
    - route:
        - destination:
            host: reviews
            subset: v1
          weight: 90
        - destination:
            host: reviews
            subset: v2
          weight: 10
---
# Stage 2: send 50% of traffic to v2
# Change the v2 weight to 50 and the v1 weight to 50
---
# Stage 3: send 100% of traffic to v2
# Change the v2 weight to 100 and remove the v1 destination
```

Content-based canary release
```yaml
# canary/content-based.yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: reviews-canary
  namespace: default
spec:
  hosts:
    - reviews
  http:
    # Internal test users go to v2
    - match:
        - headers:
            cookie:
              regex: "^(.*?;)?(test-user=true)(;.*)?$"
      route:
        - destination:
            host: reviews
            subset: v2
    # Mobile users go to v2
    - match:
        - headers:
            user-agent:
              regex: ".*(iPhone|Android).*"
      route:
        - destination:
            host: reviews
            subset: v2
    # Requests tagged with a specific header go to v2
    - match:
        - headers:
            x-canary:
              exact: "true"
      route:
        - destination:
            host: reviews
            subset: v2
    # All remaining traffic goes to v1
    - route:
        - destination:
            host: reviews
            subset: v1
          weight: 100
```

Automated canary rollout script
```bash
#!/bin/bash
# canary-rollout.sh: step-by-step canary rollout
set -euo pipefail

NAMESPACE="default"
SERVICE="reviews"
VERSION_NEW="v2"
VERSION_STABLE="v1"
STEPS=(10 25 50 75 100)
WAIT_SECONDS=300  # wait 5 minutes at each step

for weight in "${STEPS[@]}"; do
  echo "=== Canary rollout: ${VERSION_NEW} weight ${weight}% ==="

  # Update the VirtualService
  cat <<EOF | kubectl apply -f -
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: ${SERVICE}-canary
  namespace: ${NAMESPACE}
spec:
  hosts:
    - ${SERVICE}
  http:
    - route:
        - destination:
            host: ${SERVICE}
            subset: ${VERSION_STABLE}
          weight: $((100 - weight))
        - destination:
            host: ${SERVICE}
            subset: ${VERSION_NEW}
          weight: ${weight}
EOF

  echo "Weights applied; waiting ${WAIT_SECONDS} seconds..."
  # Let traffic flow before checking for errors
  sleep "${WAIT_SECONDS}"

  # Check 5xx responses from the new version (illustrative only).
  # Envoy stats live on the calling sidecar (productpage here), not on istiod.
  # The counter is cumulative and only appears if proxyStatsMatcher exposes
  # cluster stats, so this is a count threshold rather than a true error rate.
  ERROR_COUNT=$(kubectl exec -n "${NAMESPACE}" deploy/productpage -c istio-proxy -- \
    pilot-agent request GET stats | \
    awk -v key="outbound|9080|${VERSION_NEW}|${SERVICE}.${NAMESPACE}.svc.cluster.local.upstream_rq_5xx" \
      'index($0, key) { sum += $2 } END { print sum + 0 }')
  echo "5xx responses so far: ${ERROR_COUNT}"

  if [ "${ERROR_COUNT}" -gt 5 ]; then
    echo "Too many errors; rolling back to ${VERSION_STABLE}"
    cat <<EOF | kubectl apply -f -
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: ${SERVICE}-canary
  namespace: ${NAMESPACE}
spec:
  hosts:
    - ${SERVICE}
  http:
    - route:
        - destination:
            host: ${SERVICE}
            subset: ${VERSION_STABLE}
          weight: 100
EOF
    exit 1
  fi
done

echo "Canary rollout complete: ${VERSION_NEW} now serves 100% of traffic"
```

Circuit breaking and rate limiting
Circuit breaker configuration
```yaml
# resilience/circuit-breaker.yaml
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: reviews-circuitbreaker
  namespace: default
spec:
  host: reviews
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 50
        connectTimeout: 5s
      http:
        http1MaxPendingRequests: 30
        http2MaxRequests: 50
        maxRequestsPerConnection: 5
        maxRetries: 2
        idleTimeout: 60s
    outlierDetection:
      # Consecutive 5xx errors before a host is ejected
      consecutive5xxErrors: 3
      # Interval between ejection analysis sweeps
      interval: 10s
      # Base ejection duration
      baseEjectionTime: 30s
      # Maximum percentage of hosts that can be ejected
      maxEjectionPercent: 60
      # Outlier detection is disabled below this healthy percentage
      minHealthPercent: 30
      # Consecutive gateway errors (502, 503, 504) before ejection
      consecutiveGatewayErrors: 2
```

Rate limiting configuration
```yaml
# resilience/rate-limit.yaml
# Local rate limiting via EnvoyFilter: each proxy enforces its own token
# bucket (this is the local_ratelimit filter, not a shared global limit)
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: ratelimit-filter
  namespace: istio-system
spec:
  configPatches:
    - applyTo: HTTP_FILTER
      match:
        context: SIDECAR_INBOUND
        proxy:
          proxyVersion: "^1\\.21.*"
      patch:
        operation: INSERT_BEFORE
        value:
          name: envoy.filters.http.local_ratelimit
          typed_config:
            "@type": type.googleapis.com/udpa.type.v1.TypedStruct
            type_url: type.googleapis.com/envoy.extensions.filters.http.local_ratelimit.v3.LocalRateLimit
            value:
              stat_prefix: http_local_rate_limiter
              # 100 requests per 60 seconds per proxy
              token_bucket:
                max_tokens: 100
                tokens_per_fill: 100
                fill_interval: 60s
              filter_enabled:
                default_value:
                  numerator: 100
                  denominator: HUNDRED
              filter_enforced:
                default_value:
                  numerator: 100
                  denominator: HUNDRED
              response_headers_to_add:
                - append_action: OVERWRITE_IF_EXISTS_OR_ADD
                  header:
                    key: x-local-rate-limit
                    value: "true"
              status:
                code: 429
```

Retries and timeouts
```yaml
# resilience/retry-timeout.yaml
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: reviews-resilience
  namespace: default
spec:
  hosts:
    - reviews
  http:
    - route:
        - destination:
            host: reviews
            subset: v1
      # Overall request timeout
      timeout: 10s
      # Retry policy
      retries:
        attempts: 3
        perTryTimeout: 3s
        retryOn: 5xx,reset,connect-failure,refused-stream
        retryRemoteLocalities: true
      # Fault injection (for testing): abort 0.1% of requests with HTTP 500
      fault:
        abort:
          percentage:
            value: 0.1
          httpStatus: 500
```

Security
Mutual TLS (mTLS)
```yaml
# security/mtls.yaml
# Mesh-wide strict mode: only mTLS traffic is accepted
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system
spec:
  mtls:
    mode: STRICT
---
# Workload-level policy: the selector narrows it to app: reviews.
# Omit the selector to make the policy apply to the whole namespace.
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: default
spec:
  mtls:
    mode: STRICT
  selector:
    matchLabels:
      app: reviews
  portLevelMtls:
    9080:
      mode: STRICT
```

Authorization policies
```yaml
# security/authorization-policy.yaml
# Allow productpage to call reviews
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: reviews-policy
  namespace: default
spec:
  selector:
    matchLabels:
      app: reviews
  action: ALLOW
  rules:
    - from:
        - source:
            principals:
              - "cluster.local/ns/default/sa/bookinfo-productpage"
            namespaces:
              - "default"
      to:
        - operation:
            methods: ["GET"]
            paths: ["/api/reviews*"]
      when:
        - key: request.headers[x-token]
          values: ["valid-token-*"]
---
# Deny all access to reviews from outside these namespaces
# (DENY policies are evaluated before ALLOW policies)
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: deny-external-reviews
  namespace: default
spec:
  selector:
    matchLabels:
      app: reviews
  action: DENY
  rules:
    - from:
        - source:
            notNamespaces:
              - "default"
              - "istio-system"
---
# JWT authentication. RequestAuthentication only rejects invalid tokens;
# requests without a token still pass unless an AuthorizationPolicy
# requires requestPrincipals.
apiVersion: security.istio.io/v1beta1
kind: RequestAuthentication
metadata:
  name: jwt-auth
  namespace: default
spec:
  selector:
    matchLabels:
      app: productpage
  jwtRules:
    - issuer: "auth@example.com"
      jwksUri: "https://auth.example.com/.well-known/jwks.json"
      audiences:
        - "bookinfo"
      forwardOriginalToken: true
```

Observability
Kiali service topology
```bash
# Install Kiali from the bundled addon manifest
# (it is an addon and is not installed by any Istio profile, including demo)
kubectl apply -f https://raw.githubusercontent.com/istio/istio/release-1.21/samples/addons/kiali.yaml

# Open the dashboard
istioctl dashboard kiali

# Or use port-forward
kubectl port-forward -n istio-system svc/kiali 20001:20001
# Then open http://localhost:20001
```

Prometheus + Grafana monitoring
```bash
# Install Prometheus
kubectl apply -f https://raw.githubusercontent.com/istio/istio/release-1.21/samples/addons/prometheus.yaml

# Install Grafana
kubectl apply -f https://raw.githubusercontent.com/istio/istio/release-1.21/samples/addons/grafana.yaml

# Open Grafana
istioctl dashboard grafana
```
```promql
# Key Istio metrics (PromQL)

# Request rate
rate(istio_requests_total[5m])

# Error rate
sum(rate(istio_requests_total{response_code=~"5.."}[5m]))
/
sum(rate(istio_requests_total[5m]))

# P99 latency
histogram_quantile(0.99,
  sum(rate(istio_request_duration_milliseconds_bucket[5m]))
  by (le, destination_service)
)

# Service-to-service traffic
sum(rate(istio_requests_total[1m]))
by (source_workload, destination_workload)
```

Jaeger distributed tracing
```bash
# Install Jaeger
kubectl apply -f https://raw.githubusercontent.com/istio/istio/release-1.21/samples/addons/jaeger.yaml

# Open the Jaeger UI
istioctl dashboard jaeger

# Customize the trace sampling rate (configured in meshConfig)
# The default sampling rate is 1%; 0.1% to 1% is a reasonable range in production
```
```yaml
# observability/tracing-config.yaml
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  meshConfig:
    enableTracing: true
    defaultConfig:
      tracing:
        sampling: 5.0  # 5% sampling rate
        zipkin:
          address: jaeger-collector.istio-system:9411
        max_path_tag_length: 256
```

Custom metrics (Telemetry API)
```yaml
# observability/telemetry.yaml
apiVersion: telemetry.istio.io/v1alpha1
kind: Telemetry
metadata:
  name: default-tracing
  namespace: istio-system
spec:
  tracing:
    # The provider name must match an entry in meshConfig.extensionProviders
    - providers:
        - name: jaeger
      randomSamplingPercentage: 5.0
      customTags:
        user_id:
          header:
            name: x-user-id
---
apiVersion: telemetry.istio.io/v1alpha1
kind: Telemetry
metadata:
  name: custom-metrics
  namespace: default
spec:
  metrics:
    - providers:
        - name: prometheus
      overrides:
        # Add extra tags (dimensions) to the request count metric
        - match:
            metric: REQUEST_COUNT
          tagOverrides:
            request_method:
              value: request.method
            request_path:
              value: request.path
            request_host:
              value: request.host
```

Performance tuning
Sidecar resource configuration
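```yaml
# performance/sidecar-resources.yaml
# Namespace-wide default: sidecars only learn about services in these namespaces
apiVersion: networking.istio.io/v1beta1
kind: Sidecar
metadata:
  name: default-sidecar
  namespace: default
spec:
  egress:
    - hosts:
        - "default/*"
        - "istio-system/*"
  outboundTrafficPolicy:
    mode: REGISTRY_ONLY
---
# Restrict the outbound scope of one workload's sidecar
apiVersion: networking.istio.io/v1beta1
kind: Sidecar
metadata:
  name: productpage-sidecar
  namespace: default
spec:
  workloadSelector:
    labels:
      app: productpage
  egress:
    # Hosts use the namespace/dnsName form, with dnsName as an FQDN
    - hosts:
        - "default/reviews.default.svc.cluster.local"
        - "default/details.default.svc.cluster.local"
        - "default/ratings.default.svc.cluster.local"
```

Istiod performance optimization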
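```yaml
# performance/istiod-optimization.yaml
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  meshConfig:
    defaultConfig:
      # Extra Envoy stats to expose; keep this list narrow,
      # because every additional stat costs proxy memory
      proxyStatsMatcher:
        inclusionRegexps:
          - "cluster\\..*"
        inclusionSuffixes:
          - "downstream_rq"
          - "upstream_rq"
    # Only watch labeled namespaces, which shrinks the config istiod
    # has to process and push
    discoverySelectors:
      - matchLabels:
          istio-discovery: enabled
  values:
    pilot:
      resources:
        requests:
          cpu: 500m
          memory: 512Mi
        limits:
          cpu: 2000m
          memory: 2Gi
      # Debounce EDS pushes; this is set through an istiod environment
      # variable and is already enabled by default
      env:
        PILOT_ENABLE_EDS_DEBOUNCE: "true"
    global:
      # Default resources for injected sidecar proxies
      proxy:
        resources:
          requests:
            cpu: 50m
            memory: 64Mi
          limits:
            cpu: 200m
            memory: 256Mi
```

Troubleshooting
Common diagnostic commands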
```bash
# Analyze configuration across the whole mesh
istioctl analyze -A

# Analyze a specific namespace
istioctl analyze -n default

# Check proxy sync status
istioctl proxy-status

# Inspect proxy configuration
istioctl proxy-config route deploy/productpage -n default
istioctl proxy-config cluster deploy/productpage -n default
istioctl proxy-config endpoint deploy/productpage -n default
istioctl proxy-config listener deploy/productpage -n default

# View proxy logs
kubectl logs deploy/productpage -c istio-proxy -n default

# Enable debug logging (the subcommand is "log")
istioctl pc log deploy/productpage -n default --level debug

# Inspect certificates
istioctl pc secret deploy/productpage -n default

# Validate a configuration file
istioctl validate -f virtual-service.yaml
```

Common troubleshooting workflow
```bash
#!/bin/bash
# troubleshoot.sh: Istio troubleshooting script
NAMESPACE="default"
SERVICE="productpage"
POD=$(kubectl get pods -n "${NAMESPACE}" -l app="${SERVICE}" -o jsonpath='{.items[0].metadata.name}')

echo "=== Pod info ==="
kubectl describe pod "${POD}" -n "${NAMESPACE}" | grep -A5 "Containers:"
echo ""

echo "=== Sidecar injection status ==="
# An injected Pod lists istio-proxy next to the application container
kubectl get pod "${POD}" -n "${NAMESPACE}" -o jsonpath='{.spec.containers[*].name}'
echo ""

echo "=== Proxy sync status ==="
istioctl proxy-status "${POD}" -n "${NAMESPACE}"
echo ""

echo "=== Proxy route configuration ==="
istioctl proxy-config route "${POD}" -n "${NAMESPACE}" -o json | head -50
echo ""

echo "=== Proxy cluster configuration ==="
istioctl proxy-config cluster "${POD}" -n "${NAMESPACE}"
echo ""

echo "=== Recent error logs ==="
kubectl logs "${POD}" -c istio-proxy -n "${NAMESPACE}" --tail=20 | grep -i error
echo ""

echo "=== Configuration analysis ==="
istioctl analyze -n "${NAMESPACE}"
```

Istio vs Linkerd
Comparison table
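| Dimension | Istio | Linkerd |
|---|---|---|
| Proxy | Envoy (C++, feature-rich) | linkerd2-proxy (Rust, lightweight) |
| Resource footprint | Sidecar ~150 MB memory | Sidecar ~20 MB memory |
| Feature breadth | Very rich (traffic, security, observability) | Core features covered |
| Learning curve | Steep | Gentle |
| Community | CNCF graduated project, largest community | CNCF graduated project |
| Configuration complexity | Higher (many CRDs) | Simple and intuitive |
| Suitable scale | Large (1000+ services) | Small to medium |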
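Advantages
- Non-intrusive: application code needs no changes; sidecars are injected automatically
- Fine-grained traffic management: routing by weight, content, headers, and other dimensions
- Hardened security: mTLS encryption, RBAC authorization, and JWT authentication in one place
- Strong observability: Kiali topology, Jaeger tracing, and Prometheus metrics work with little setup
- Canary releases: fine-grained rollout strategies based on weight and content
- Service resilience: built-in circuit breaking, retries, timeouts, and rate limiting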
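Disadvantages
- Resource overhead: every Pod gains a sidecar container that consumes memory and CPU
- Steep learning curve: VirtualService, DestinationRule, and many other concepts to learn
- Harder debugging: a problem may sit in the application, the sidecar, or istiod
- Performance cost: the sidecar proxy adds request latency (typically 1-5 ms)
- Fast release cadence: APIs and configuration formats change relatively quickly
- Operational complexity: upgrading Istio requires care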
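Performance considerations
- Sidecar resource limits: set sensible CPU and memory limits for Envoy
- Sidecar scope control: use the Sidecar CRD to limit the scope of outbound service discovery
- Sampling rate tuning: keep the trace sampling rate low, roughly 0.1% to 5%
- Debounced pushes: EDS debounce reduces how often istiod pushes endpoint updates
- Connection pool settings: size connection pools to match each service's load
- Log level: use the warning level in production and avoid debug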
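To make the sampling advice concrete, the sketch below estimates how many traces per second a given sampling rate produces. The function name `traces_per_second` and the figure of 2000 requests per second are illustrative assumptions, not measurements from a real mesh.

```bash
#!/usr/bin/env bash
# traces_per_second RPS SAMPLING_PERCENT
# Prints the expected number of sampled traces per second.
# Hypothetical helper for back-of-the-envelope sizing only.
traces_per_second() {
  local rps="$1" percent="$2"
  LC_ALL=C awk -v rps="$rps" -v pct="$percent" \
    'BEGIN { printf "%.1f\n", rps * pct / 100 }'
}

# Assumed workload: 2000 requests per second through the mesh
traces_per_second 2000 100   # every request traced: 2000.0
traces_per_second 2000 5     # upper end of the suggested range: 100.0
traces_per_second 2000 0.1   # lower end of the suggested range: 2.0
```

Each sampled trace has to be exported, stored, and indexed by the tracing backend, so the difference between 100% and 1% sampling is roughly a hundredfold difference in backend load.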
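Summary
As one of the most widely used service mesh solutions, Istio uses the sidecar pattern to integrate traffic management, secure communication, and observability without touching application code. In a microservice architecture it can significantly reduce the complexity of service governance. Its resource overhead and learning cost are real, though: small projects can start with basic traffic management and introduce the advanced features gradually.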
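Key takeaways
- Istio consists of a control plane (istiod) and a data plane (Envoy sidecars)
- VirtualService defines routing rules; DestinationRule defines destination policies and subsets
- mTLS is configured with PeerAuthentication; authorization is configured with AuthorizationPolicy
- Canary releases support both weight-based and content-based modes
- Circuit breaking is configured through connectionPool and outlierDetection in a DestinationRule
- istioctl analyze is the first tool to reach for when diagnosing configuration problems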
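The canary examples in this article route `reviews` traffic to subsets `v1` and `v2`, and a subset only resolves when a DestinationRule declares it. Below is a minimal sketch that renders such a rule. The function name `render_subsets_rule` and the resource name `<service>-subsets` are hypothetical, and the sketch assumes Pods carry a `version` label as in the Deployment example.

```bash
#!/usr/bin/env bash
# render_subsets_rule SERVICE NAMESPACE VERSION...
# Prints a DestinationRule with one subset per version label.
# Hypothetical helper; the rule name "<service>-subsets" is an assumption.
render_subsets_rule() {
  local service="$1" namespace="$2"
  shift 2
  cat <<EOF
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: ${service}-subsets
  namespace: ${namespace}
spec:
  host: ${service}
  subsets:
EOF
  local version
  for version in "$@"; do
    printf '  - name: %s\n    labels:\n      version: %s\n' "$version" "$version"
  done
}

# Print the rule for review; pipe it to "kubectl apply -f -" to apply it
render_subsets_rule reviews default v1 v2
```

If another DestinationRule already targets the same host, folding the subsets into that rule is likely simpler than keeping two rules for one host.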
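Common misconceptions
- "Istio solves every microservice problem": Istio governs service-to-service communication; business logic problems are still yours to solve
- "Automatic injection means zero configuration": VirtualService and DestinationRule resources are still needed for the advanced features
- "mTLS is enforced by default": Istio defaults to PERMISSIVE mode, which accepts both plaintext and mTLS traffic; STRICT must be set explicitly
- "Sidecars cost nothing": each sidecar adds roughly 50-150 MB of memory and 1-5 ms of latency
- "A canary release is just a weight change": health checks for the new version and a rollback strategy matter just as much
- "Istio fully replaces an API gateway": Istio is strongest for east-west traffic; its ingress gateway handles north-south routing, but API management features usually still call for an API gateway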
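As a rough illustration of the sidecar memory figure above, this sketch multiplies the 50-150 MB per-sidecar range by a Pod count. The function name `sidecar_memory_range_mib` and the count of 200 Pods are assumptions chosen for the example; real usage depends on configuration size and traffic.

```bash
#!/usr/bin/env bash
# sidecar_memory_range_mib PODS
# Prints the low and high estimate of total sidecar memory in MiB,
# using the 50-150 MB per-sidecar range quoted in this article.
sidecar_memory_range_mib() {
  local pods="$1"
  echo "$(( pods * 50 )) $(( pods * 150 ))"
}

# Assumed cluster size: 200 meshed Pods
sidecar_memory_range_mib 200   # prints: 10000 30000
```

At 200 Pods the mesh alone would account for somewhere between about 10 and 30 GiB of memory under these assumptions, which is why limiting sidecar scope and setting proxy resource limits pays off.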
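Learning path
- Beginner: install Istio, deploy the sample application, understand sidecar injection
- Intermediate: VirtualService routing, canary releases, mTLS configuration
- Advanced: circuit breaking and rate limiting, authorization policies, custom metrics, Wasm plugins
- Expert: multi-cluster deployment, custom EnvoyFilters, performance tuning
- Architect: service mesh governance platforms, multi-tenant isolation, cross-cluster federation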
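When to use Istio
- Systems with a larger number of microservices (10+ services)
- Scenarios that need fine-grained traffic management (canary releases, A/B testing)
- Strict requirements for service-to-service security (finance, healthcare)
- Systems that need end-to-end tracing and observability
- Polyglot microservice architectures, where the non-intrusive model fits especially well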
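Adoption advice
- Roll out in phases: start with traffic management, then add security, then round out observability
- Pick the profile per environment: use the demo profile for development and the default profile, tuned as needed, for production (Istio has no built-in profile named production)
- Size sidecar resources sensibly: adjust CPU and memory to the actual load
- Establish a canary release SOP: define a standard rollout process and a rollback plan
- Monitor Istio itself: use Prometheus to track the health of istiod and the sidecars
- Upgrade regularly: follow community releases to pick up security patches promptly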
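A rollback plan needs an explicit trigger. One possible sketch: compute the error rate from two readings of cumulative request counters taken at the start and end of a canary step, then compare it with a threshold. The function names and the 5% threshold are illustrative assumptions, and where the counters come from (Prometheus, Envoy stats) is left open.

```bash
#!/usr/bin/env bash
# error_rate_percent ERR_BEFORE ERR_AFTER TOTAL_BEFORE TOTAL_AFTER
# Prints the integer error percentage over the interval between two
# readings of cumulative counters. Prints 0 when no requests were seen.
error_rate_percent() {
  local errors=$(( $2 - $1 )) total=$(( $4 - $3 ))
  if (( total <= 0 )); then
    echo 0
    return
  fi
  echo $(( errors * 100 / total ))
}

# should_rollback RATE THRESHOLD
# Succeeds (exit status 0) when the rate exceeds the threshold.
should_rollback() {
  (( $1 > $2 ))
}

# Assumed readings: 6 new errors out of 100 new requests during the step
rate=$(error_rate_percent 10 16 1000 1100)
if should_rollback "$rate" 5; then
  echo "error rate ${rate}% exceeds 5%: roll back"
else
  echo "error rate ${rate}%: continue"
fi
```

Working from counter deltas rather than raw counter values keeps errors from earlier steps from triggering a rollback later in the rollout.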
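Review questions
- Why is Istio's sidecar model a better fit for polyglot microservices than an SDK model?
- How does Envoy obtain its configuration from istiod through the xDS API?
- During a canary release, how do you make sure the same user's requests are always routed to the new version?
- How is mTLS certificate rotation implemented, and what impact does it have on applications?
- How do you implement cross-cluster service discovery and communication in Istio?
- How does the Wasm plugin mechanism extend what Istio can do?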
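As a starting point for the question about keeping one user on the new version: weight-based routing chooses a destination per request, so the same user can alternate between versions. One common approach is to derive a stable bucket from the user ID at the edge and tag canary users with a header such as `x-canary: true`, which a content-based VirtualService can then match. The sketch below shows only the bucketing step; the function names and the use of `cksum` as the hash are assumptions for illustration.

```bash
#!/usr/bin/env bash
# canary_bucket USER_ID
# Prints a stable bucket number in the range 0..99 for the user.
# cksum is used as a simple deterministic hash (an assumption; any
# stable hash would do).
canary_bucket() {
  local crc
  crc=$(printf '%s' "$1" | cksum | cut -d' ' -f1)
  echo $(( crc % 100 ))
}

# canary_subset USER_ID CANARY_PERCENT
# Prints v2 when the user's bucket falls below the canary percentage,
# otherwise v1. The same user always gets the same answer for a given
# percentage, and stays on v2 as the percentage grows.
canary_subset() {
  if (( $(canary_bucket "$1") < $2 )); then
    echo v2
  else
    echo v1
  fi
}

canary_subset alice 0     # prints: v1
canary_subset alice 100   # prints: v2
```

Because a user's bucket never changes, raising the canary percentage only moves additional users to the new version and never moves anyone back to the old one.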
