Adding observability to a monolithic Go app
- Panagiotis Petridis
Introduction
So you haven't fallen for microservices just yet. Meanwhile, all the cool kids are showing off their observability and metrics dashboards, with detailed traces that show exactly how long each portion of a request took and whether it had any errors. But they aren't really doing that in code; they usually have a Kubernetes cluster with a service mesh on top and an entire platform team that makes all the magic happen. In this short article I'll go over how to get some of that sweet observability in your existing monolithic application while writing almost no code.
The Plan
To accomplish our goal we will need three things: interfaces, proxies and interceptors. The rest of the article assumes that you have a basic three-layer architecture with app/service/repo layers, but you should be able to apply the same technique to other architectures. It's also worth noting that although I'll be showing how to do this in Go, you can use the same pattern in other languages (apparently Java folks have been using it for a while).
The plan is to bisect each layer with interfaces. Each layer must connect to the next through interfaces, which either already exist or are generated by a tool like ifacemaker. We then create proxies that take the implementation of those interfaces and a set of interceptors as arguments, and intercept all the requests made to the implementations. The interceptors then run, and they can add logging, metrics and traces to each request.
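The whole mechanism can be sketched by hand before reaching for a generator. Here's a minimal, hand-written version of the interface/proxy/interceptor idea; the type names mirror what proxygen-style generators emit, but everything in this sketch is made up for illustration:

```go
package main

import "fmt"

// Handler is the generic call signature every interceptor wraps.
type Handler func(args []interface{}) []interface{}

// Interceptor wraps a Handler for a named method.
type Interceptor func(method string, next Handler) Handler

// InterceptorChain applies interceptors in order around the real call.
type InterceptorChain []Interceptor

func (c InterceptorChain) Apply(args []interface{}, method string, impl Handler) []interface{} {
	h := impl
	// wrap back-to-front so the first interceptor runs outermost
	for i := len(c) - 1; i >= 0; i-- {
		h = c[i](method, h)
	}
	return h(args)
}

// Greeter is the layer boundary; both the proxy and the real
// implementation satisfy it, so callers can't tell them apart.
type Greeter interface {
	Greet(name string) string
}

type greeter struct{}

func (greeter) Greet(name string) string { return "hello " + name }

// GreeterProxy forwards every call through the interceptor chain.
type GreeterProxy struct {
	Implementation Greeter
	Interceptors   InterceptorChain
}

func (p *GreeterProxy) Greet(name string) string {
	rets := p.Interceptors.Apply(
		[]interface{}{name},
		"Greet",
		func(args []interface{}) []interface{} {
			return []interface{}{p.Implementation.Greet(args[0].(string))}
		},
	)
	return rets[0].(string)
}

func main() {
	// a tiny logging interceptor: observe the call, then pass it on
	logging := func(method string, next Handler) Handler {
		return func(args []interface{}) []interface{} {
			fmt.Printf("calling %s with args %v\n", method, args)
			return next(args)
		}
	}
	var g Greeter = &GreeterProxy{
		Implementation: greeter{},
		Interceptors:   InterceptorChain{logging},
	}
	fmt.Println(g.Greet("world"))
}
```

The point of the pattern is the last two lines of main: the caller only holds the Greeter interface, so swapping the implementation for the proxy is invisible to the rest of the code.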
Show me the code
Now that we know the plan, it's time to put it into practice. We'll go over an example of how to do this in a Go monolithic application. Let's get started!
The Setup
Let's go over the existing code that you probably have.
There are probably repository interfaces:
type TaskRepository interface {
	List(ctx context.Context) ([]*models.Task, error)
	Get(ctx context.Context, id int64) (*models.Task, error)
	Create(ctx context.Context, task *models.Task) error
	Update(ctx context.Context, task *models.Task) error
	Delete(ctx context.Context, id int64) error
}
Repository implementations:
type MemoryTaskRepository struct {
	tasks  []*models.Task
	logger *zap.Logger
}

func ProvideMemoryTaskRepository(logger *zap.Logger) repo.TaskRepository {
	l := logger.With(
		zap.String("repository", "task"),
		zap.String("storage", "memory"),
	)
	return &MemoryTaskRepository{
		tasks:  []*models.Task{},
		logger: l,
	}
}

func (r *MemoryTaskRepository) List(ctx context.Context) ([]*models.Task, error) {
	return r.tasks, nil
}

// other functions below...
Some service interfaces:
type TaskService interface {
	List(ctx context.Context) ([]*models.Task, error)
	Get(ctx context.Context, id int64) (*models.Task, error)
	Create(ctx context.Context, task *models.Task) error
	Update(ctx context.Context, task *models.Task) error
	Delete(ctx context.Context, id int64) error
}
Some service implementations (although usually one):
type taskService struct {
	repo   repo.TaskRepository
	logger *zap.Logger
}

func ProvideTaskService(repo repo.TaskRepository, logger *zap.Logger) iservice.TaskService {
	l := logger.With(
		zap.String("service", "task"),
	)
	return &taskService{
		repo:   repo,
		logger: l,
	}
}

func (s *taskService) List(ctx context.Context) ([]*models.Task, error) {
	return s.repo.List(ctx)
}

// other functions...
and the same for the app layer, and probably some controllers too - you get the idea.
The Problem and the Solution
The problem we are facing is that there are too many domain services, repositories and app services. How do we add logs and handlers to all of them? Take tracing, for example: we need to start and end spans - how do we do that without adding explicit statements at the start and end of every function?
The answer is code generation. We barely have to write any of this code; we can generate almost all of it. First of all, we need to generate the interfaces if we don't have them already. You can easily do that by adding a top-level go:generate comment that calls ifacemaker to generate all the interfaces. Now that we have the interfaces, we need some way to intercept all of the requests. For this we can use something like proxygen, with yet another go:generate comment at the top of each file. Proxygen will generate all the proxies for you, which you can then return from the Provider methods of each service.
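As a sketch, the top of a service file might carry directives along these lines. The flag names and paths below are purely illustrative - check the README of ifacemaker and your proxy generator for the real invocations:

```go
// Hypothetical go:generate directives; flags and output paths are made up.
//go:generate ifacemaker -f task_service.go -s taskService -i TaskService -p iservice -o ../iservice/task_service.go
//go:generate proxygen -f ../iservice/task_service.go -o ../proxy/task_service.go
```

A single `go generate ./...` then regenerates all interfaces and proxies in one pass.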
For the example above the generated proxy would look something like this:
type TaskService struct {
	Implementation importiserviceTaskService0.TaskService
	Interceptors   proxygenInterceptors.InterceptorChain
}

var _ importiserviceTaskService0.TaskService = (*TaskService)(nil)

func (this *TaskService) List(
	arg0 importiserviceTaskService1.Context,
) (
	[]*importiserviceTaskService2.Task,
	error,
) {
	rets := this.Interceptors.Apply(
		[]interface{}{
			arg0,
		},
		"List",
		func(args []interface{}) []interface{} {
			res0, res1 := this.Implementation.List(
				args[0].(importiserviceTaskService1.Context),
			)
			return []interface{}{
				res0,
				res1,
			}
		},
	)
	return proxygenCaster.Cast[[]*importiserviceTaskService2.Task](rets[0]),
		proxygenCaster.Cast[error](rets[1])
}

// more functions below...
Then you can change the ProvideTaskService method to return this instead:
return &proxy.TaskService{
	Implementation: &taskService{
		repo:   repo,
		logger: l,
	},
	Interceptors: interceptor.InterceptorChain{},
}
Now you may be thinking that you'll have to add all these comments and update the providers for every service, and that's still quite a lot of work. Worry not, because I'm just as lazy as you are, so I have a solution for that too! When I had to do this, I wrote a quick JS script to process all the files, parse the names of the services with some regex, and then add the comments and update the providers - with enough regex, you can do anything! You can also do some of that with Vim macros if you're into that.
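The original script was JS, but the same idea fits in a few lines of Go. This is a hypothetical sketch (the function name, regexes and directive text are all made up): find the package clause and inject a go:generate comment above it, skipping files that already have one.

```go
package main

import (
	"fmt"
	"regexp"
)

// addDirective injects a go:generate comment above the package clause
// of a Go source file, unless one is already present. In a real script
// you'd read each file, call this, and write the result back.
func addDirective(src, directive string) string {
	if regexp.MustCompile(`(?m)^//go:generate `).MatchString(src) {
		return src // already processed, keep the script idempotent
	}
	loc := regexp.MustCompile(`(?m)^package `).FindStringIndex(src)
	if loc == nil {
		return src // not a Go file worth touching
	}
	return src[:loc[0]] + directive + "\n" + src[loc[0]:]
}

func main() {
	src := "package service\n\ntype taskService struct{}\n"
	fmt.Print(addDirective(src, "//go:generate ifacemaker ..."))
}
```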
The Interceptors
OK, we've finally reached the fun part. Here you are free to do whatever you want: add tracing spans, logging, metrics and so on. For this demo I've added a simple logging interceptor that will allow us to find what caused a runtime panic in our application. The interceptor looks like this:
func TracingInterceptor(
	logger *zap.Logger,
	structName string,
) interceptor.Interceptor {
	return func(method string, next interceptor.Handler) interceptor.Handler {
		logger := logger.With(
			zap.String("method", method),
			zap.String("struct", structName),
		)
		return func(args []interface{}) []interface{} {
			// use a per-call logger; reassigning the captured logger
			// would accumulate fields across requests
			l := logger
			var ctx context.Context
			ctxIdx := -1
			for idx, arg := range args {
				if c, ok := arg.(context.Context); ok {
					ctx = c
					ctxIdx = idx
					break
				}
			}
			if ctx != nil {
				if userID, ok := ctx.Value("UserID").(string); ok {
					l = l.With(
						zap.String("UserID", userID),
					)
				} else {
					l.Info("no user id")
				}
				if requestID, ok := ctx.Value("RequestID").(string); ok {
					l = l.With(
						zap.String("RequestID", requestID),
					)
				} else {
					l.Info("no request id")
				}
				args[ctxIdx] = util.AddTraceToContext(
					ctx,
					fmt.Sprintf("%s.%s", structName, method),
				)
			}
			l.Info("calling method")
			return next(args)
		}
	}
}

// the AddTraceToContext function in the util package
func AddTraceToContext(ctx context.Context, trace string) context.Context {
	c, ok := ctx.(*gin.Context)
	if !ok {
		return ctx
	}
	stack, ok := c.Value(TraceStackKey).([]string)
	if !ok {
		stack = []string{}
	}
	stack = append(stack, trace)
	c.Set(TraceStackKey, stack)
	return c
}
Note that I'm also using gin, so I'm attaching the trace stack to the gin context, but you could have a traceID on the request and store traces in a global store instead. The options are endless.
Quick Demo
You can find all the code for this demo here. I have intentionally introduced a panic in the repository code. With a plain recover-from-panic middleware we would lose the context of the failure, and it'd be pretty difficult to debug where the error happened. By using the tracing interceptor from earlier in a middleware, we can do something like this:
e.Use(gin.CustomRecovery(func(c *gin.Context, err any) {
	traceStack := util.GetTraceStack(c)
	logger.With(
		zap.String("RequestID", c.GetString("RequestID")),
		zap.String("UserID", c.GetString("UserID")),
		zap.Strings("TraceStack", traceStack),
		zap.Any("Error", err),
	).Error("Oh no! Anyway...")
}))
The result is that when we run the server and send a request to /tasks/1, which tries to get a task from an empty slice, we see the following error in the logs:
2023-09-11T21:02:44.283+0100 ERROR cmd/main.go:33 Oh no! Anyway... {"RequestID": "f030dc2c-30d9-441e-a3fe-834d16bea81c", "UserID": "panagiotis", "TraceStack": ["TaskService.Get", "TaskRepository.Get"], "Error": "runtime error: index out of range [1] with length 0"}
It's not very readable here, but usually you would parse and export these logs so they'd look nicer in your logging solution. The important thing is the TraceStack field, which tells us exactly what calls were made and in what order. We can see that the last thing we called was TaskRepository.Get, so the issue must be somewhere there.
Summary
Although this pattern is pretty common and the example is quite basic, I hope it helped you see the value in generating code for interfaces and proxies. You can use this technique to split your layers and add interceptors for metrics, logging and traces. You barely have to write any code, and you get all these features at minimal execution cost (the proxygen library doesn't use reflection, so it's pretty fast). Give it a Go (pun intended) and see if it works for you!