Selectel

How I scaled a legacy infrastructure management interface

Overview

Selectel is a leading cloud infrastructure and data center provider. One of its core products — Cloud Servers (Virtual Machines) — was originally designed for small infrastructures but gradually evolved into a critical enterprise-level product without being structurally redesigned.

When a major enterprise client reported severe usability and performance issues while managing 50–100 servers, it became clear that the existing interface no longer scaled — neither technically nor cognitively.

As the sole UX designer, I led the redesign of the servers panel from discovery to release, focusing on improving infrastructure health assessment, speed of interaction, and scalability, while preserving usability for small clients.

Problem

For enterprise clients managing large infrastructures:

  • The server list loaded slowly and blocked work.
  • Identifying a specific server required excessive scrolling.
  • Key actions (health check, repair, connection) were hard to access.

This resulted in:

  • Decreased user satisfaction,
  • Risk of churn among high-value customers.

At the same time, small clients still relied on the existing mental model — so a full redesign risked alienating them.

My Role

Sole UX Designer in an agile team (PO, QA, 2 developers). I owned:

  • Discovery and research
  • UX strategy and concept
  • Prototyping and usability testing
  • Interaction and information architecture
  • Design delivery, onboarding, and success metrics

Discovery & Research

User Segments

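There are two major types of users: (1) enterprise clients managing large infrastructures, and (2) small clients with small infrastructures. The interface was clearly optimized for group #2 while failing group #1.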

Session Recordings & Web Analytics

I started with session replays and heatmaps to quickly understand real user behavior.

Deep Product & Technical Context

Cloud servers are tightly connected to disks, networks, images, and clusters, each owned by a different team.

To understand constraints and opportunities, I interviewed backend and frontend engineers and held discussions with support teams, who hear about user pain points first-hand. This helped uncover:

  • legacy technical limitations,
  • high-risk interactions,
  • backend-dependent UX decisions.

Competitive Analysis

Enterprise clients typically use multiple cloud providers simultaneously to diversify risk. This made competitor analysis especially important:

➞ common patterns reduce switching costs,
➞ familiar interactions increase trust.

I analyzed how competitors handle large server lists, visibility of key attributes, bulk actions, and health assessment.

User Interviews

I interviewed 5 system administrators managing 10+ servers per location.

The interviews revealed 5 key Jobs to Be Done, ranked by importance. Participants also rated the difficulty of each task and the time it required in the existing UI.

The most critical tasks were the least efficient.

Design Goals

  1. Improve usability of critical enterprise tasks.
  2. Enable fast infrastructure scanning.
  3. Reduce perceived and actual latency.
  4. Remove legacy constraints where possible.
  5. Preserve usability for small clients.

Success Metrics

  • Task completion time
  • Qualitative satisfaction feedback
  • Retention of small clients
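Task completion time is easiest to compare between the old and new UI as a relative change. A minimal sketch of that arithmetic follows; the function name and the timings are hypothetical placeholders, not measurements from this project.

```python
from statistics import mean


def relative_time_change(before_s, after_s):
    """Percent change in mean task completion time (negative means faster)."""
    baseline = mean(before_s)
    return (mean(after_s) - baseline) * 100 / baseline


# Hypothetical timings in seconds for one task, NOT data from the study.
old_ui = [40.0, 50.0, 60.0]
new_ui = [36.0, 45.0, 54.0]

change = relative_time_change(old_ui, new_ui)
print(f"Mean completion time changed by {change:.1f}%")
```

Averaging per task before comparing keeps a single slow participant from dominating the result; with only a few participants, the figure is indicative rather than statistically robust.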

Ideation & Prototyping

I explored multiple structural approaches (compact layouts, bulk interactions, customizable tables, aggregated health widgets, and others) and discussed them with my team. Most of the ideas were shelved because of the limited backend scope.

After prioritizing features, we decided to focus the first iteration on changes to the server list. Here, I had to balance the needs of two different user types: enterprise users, who are interested in density and speed, and small users, who value familiarity and simplicity.

Usability Testing & Iteration

I tested early prototypes with internal engineers (as proxy users).

Most tasks became faster, with one critical exception: server recovery took longer because the boot disk information was buried too deeply. I iterated on the structure and retested until all key tasks improved.

Final Solution

Table instead of cards

Cards were replaced with a tabular layout, which accelerated infrastructure health assessment and instance search thanks to the higher information density. Furthermore, this solution scales better to large infrastructures.

Faster Health Assessment

Research showed that server repair follows a mental checklist: check the operating system (1), check the type of the boot disk (2), and then connect to the server’s console (3). The UI was optimized to support this sequence with minimal navigation.

Visual Hierarchy for Server Identification

An IP address often reflects a server’s role in the infrastructure, so my teammate and I came up with a dedicated indication system:

  • public IP — larger size and black color,
  • private IP — smaller size and grey color.

This made it possible to show all addresses in one column while keeping the different IP types easy to tell apart. Moreover, public servers (the main players in the infrastructure) became more visible.
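The public/private distinction above can be sketched as a small mapping from address to visual style. This is an illustration only: the function names and style values are hypothetical, and it assumes the address range alone tells the two apart, whereas a real panel would more likely use the network type reported by the backend.

```python
import ipaddress


def ip_style(address: str) -> dict:
    """Map an IP address to the visual treatment described above."""
    ip = ipaddress.ip_address(address)
    if ip.is_private:
        # Private addresses are de-emphasized: smaller and grey.
        return {"kind": "private", "size": "small", "color": "grey"}
    # Public addresses are emphasized: larger and black.
    return {"kind": "public", "size": "large", "color": "black"}


def order_for_cell(addresses: list[str]) -> list[str]:
    """Put public addresses first so they lead the single IP column."""
    return sorted(addresses, key=lambda a: ipaddress.ip_address(a).is_private)


# Example addresses chosen for illustration only.
cell = order_for_cell(["10.0.0.5", "8.8.8.8", "192.168.1.2"])
for addr in cell:
    print(addr, ip_style(addr))
```

Because the sort is stable, addresses of the same type keep their original order within the cell.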

Dedicated Server Page with the initial structure

A separate server page enables working with multiple servers in parallel, which was impossible in the old UI.

To reduce cognitive load and speed up delivery, the first version reused the initial card structure with one key change: the console, which was used most often, was moved from the middle to the last tab. Placing it on the first tab would have degraded perceived performance because the console is slow to load, while the last tab is the quickest of the remaining tabs to reach.

Onboarding & Feedback Loop

For users who already had servers at the release date, I created a set of onboarding tips. They covered only typical tasks so as not to overwhelm users, and each tip pointed to a specific part of the interface to help users get familiar with it.

Furthermore, a feedback entry point was added directly in the interface.

Delivery & Support

Deliverables included:

  • interactive prototypes
  • UI copy & translations
  • updated user documentation
  • analytics events & metrics
  • training sessions for customer support

Results & Impact

  • Task completion time decreased by 17% on average;
  • Reduced scrolling;
  • Faster time-to-first-action due to async loading;
  • Positive feedback via support tickets and direct channels;
  • Research insights formed a long-term UX backlog.
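The time-to-first-action gain from async loading can be illustrated with a minimal sketch: render the rows as soon as the lightweight list arrives, then fill in slower details as they complete. The function names, server names, and delays are all hypothetical; this is not the production implementation.

```python
import asyncio


async def fetch_server_list() -> list[str]:
    # Hypothetical fast call that returns only server names.
    await asyncio.sleep(0.01)
    return ["web-1", "db-1", "cache-1"]


async def fetch_details(name: str) -> tuple[str, dict]:
    # Hypothetical slow call for per-server details (disks, networks, ...).
    await asyncio.sleep(0.05)
    return name, {"status": "ACTIVE"}


async def load_panel() -> list[str]:
    events = []
    servers = await fetch_server_list()
    # The table is usable here, before any details have arrived.
    events.append("rows rendered")
    tasks = [asyncio.create_task(fetch_details(s)) for s in servers]
    # Fill each row as soon as its own details finish loading.
    for finished in asyncio.as_completed(tasks):
        name, _details = await finished
        events.append(f"details filled: {name}")
    return events


events = asyncio.run(load_panel())
print(events)
```

The point of the pattern is that the first interaction no longer waits for the slowest dependency; the detail requests run concurrently instead of blocking the list.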

The findings became the foundation for redesigning the Volumes Panel, where I later led a full structural overhaul.

Reflection & Learnings

Looking back, I would push harder for an aggregated infrastructure health widget.

Although it required significant backend work, it could have:

  • reduced cognitive load dramatically,
  • differentiated Selectel from competitors,
  • addressed a core enterprise pain point.

Another learning: 5 interviews were enough to uncover the major issues, but more quantitative validation would have strengthened confidence in the findings.

Most importantly, shipping quickly allowed us to learn faster from real usage and iterate based on actual behavior.

Key Takeaways

This project taught me how to:

  • redesign legacy systems under real constraints,
  • balance enterprise and SMB needs,
  • work deeply across design, engineering, and support,
  • deliver measurable impact without “perfect” conditions.