Wednesday, November 9, 2016

How Upwork Takes an Engineering-driven Approach to Solving Problems


The engineers at Upwork are closest to the systems that impact our most precious asset: our customers. As such, they are in a position to influence and improve the inner workings of our site to make it better, faster, and more stable, so freelancers and clients can get more done.

Because of their proximity to the customer and the site, engineers can contribute more than just code: they can provide insight and perspective on problem-solving at a larger scale. What follows is a look at a few valuable lessons we’ve learned about shifting certain decision-making responsibilities to the engineering team, a move that has yielded significant improvements to the stability and performance of our legacy platform, and a better experience for the users who rely on our site every day.

Why Make Changes to a Retiring Legacy System?

The oDesk Remote Procedure Call (oDesk RPC, or ORPC) team at Upwork maintains the legacy oDesk backend, written in Perl. The name oDesk RPC originally referred to our own custom binary RPC protocol for remote calls between the backend and its consumers.

Although this backend subsystem will be discontinued in the mid-term in favor of a modernized microservice platform, it still handles all of the core business entities of our domain model (e.g., users, job posts, job applications, offers, contracts, and time tracking information). Because of this, we felt that as long as it is actively used, it should bring no surprises. It was therefore important to increase the stability, performance, and predictability of this legacy backend subsystem.

Planning the Work & Working the Plan

At the beginning of Q2 2016, we decided to switch our planning responsibility from the Product Team over to the Engineering Team. Fortunately, we had some space in our sprints to make room for this experiment.

To start, we created a Google Doc and asked each developer on the team to provide three items they felt were worth addressing. After collecting everyone’s input, we discussed the proposals and then finalized our action items.

Every developer got the chance to work on items they personally considered important, which had a very positive impact on team performance and motivation. All scheduled tasks were completed on time and within the planned scope, including:

  • Fully refactored email rendering in the legacy backend. This resulted in a 2x to 5x speedup of the rendering process and removed the network dependency for rendering.
  • Implemented fully asynchronous email sending. This effort yielded a 5x increase in the speed of email sending calls, loosened our coupling to external dependencies (Mailgun/SMTP), and thus increased the overall stability of our application server.
  • Enabled modernized time tracking synchronization on our production servers. This resulted in a 10x speedup of the process and much more frequent updates of all time tracking reports.
  • Improved the PostgreSQL connection pool cleanup. The total number of active connections at any moment now does not exceed 3-4 for all legacy application servers.
  • Improved application logging to make problem detection in production easier. We did a huge cleanup of reported warnings, which reduced log size by an order of magnitude. This was beneficial both in terms of logging infrastructure utilization and troubleshooting.
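
The post does not show how the asynchronous email sending was built, and the legacy backend itself is written in Perl. Purely as an illustration of the general idea, here is a minimal Python sketch in which the caller hands an email to an in-process queue and returns immediately, while a background worker makes the slow network call. The names `Email`, `AsyncMailer`, and the `deliver` callable are hypothetical and are not taken from Upwork's code.

```python
import queue
import threading
from dataclasses import dataclass


@dataclass
class Email:
    recipient: str
    subject: str
    body: str


class AsyncMailer:
    """Queue emails and deliver them on a background thread."""

    def __init__(self, deliver):
        # `deliver` is a hypothetical callable that performs the slow
        # network call (for example, an SMTP or HTTP API request).
        self._deliver = deliver
        self._queue = queue.Queue()
        self.failed = []  # (email, exception) pairs for later inspection
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def send(self, email):
        """Return immediately; the worker delivers the email later."""
        self._queue.put(email)

    def _run(self):
        while True:
            email = self._queue.get()
            try:
                if email is None:  # sentinel: stop the worker
                    return
                self._deliver(email)
            except Exception as exc:
                # A failing mail provider must not take down the worker.
                self.failed.append((email, exc))
            finally:
                self._queue.task_done()

    def close(self):
        """Deliver everything still queued, then stop the worker."""
        self._queue.put(None)
        self._worker.join()
```

In this sketch a request handler would call `mailer.send(...)` and continue without waiting on the mail provider, and `mailer.close()` would run at shutdown to flush the queue. A production system would more likely use a persistent queue so that pending emails survive a restart; the in-memory queue here is only meant to show why the calling code gets faster and less sensitive to provider outages.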

Driving Platform Issues Down to Zero/Month

This process not only enabled our engineering team to make quick and efficient system improvements, it also established a pragmatic and inexpensive approach to improving the quality of our product.

We repeated this process in Q3, though on a smaller scale because our product-related workload had become significantly higher. By the end of Q3, the number of production incidents (outages) due to legacy platform issues had dropped to zero per month.

Optimizing for Success

When working in an environment that’s highly technological, it’s important to iterate, adapt, and learn better ways to approach future projects along the way. In this case, the two major lessons learned were:

  • Listen to the engineering team’s opinions and understand the reasoning behind each one. Good engineers often have a great deal of insight into the more intricate details of the software, aspects that aren’t necessarily visible at a management/product level.
  • Trust the judgement of your engineers, and provide all the support they need to execute what they think is worth executing. This lets them do their job as efficiently as possible. Engineers often need some kind of political support from upper management to accomplish what they need to accomplish, so it’s helpful to secure that up front.

Ultimately, every professional in software development seeks to contribute something impactful to their organization. By shifting certain decisions to the engineering team, Upwork Engineering was able to make quick and efficient improvements to our system, establishing a more pragmatic (and less expensive) approach to improving the quality of our product.

For engineers on the whole, this shift creates an opportunity for them to do what they love doing. For businesses, this improves a company’s product in the long run, helping to satisfy customers and maintain growth.

The post How Upwork Takes an Engineering-driven Approach to Solving Problems appeared first on Upwork Blog.



from Upwork Blog https://www.upwork.com/blog/2016/11/upwork-engineering-driven-approach-solving-problems/
