Archive for the ‘How HipChat Works’ Category

Emoticons: saying nothing, but also everything


Let’s be honest: emoticons are awesome. They’re probably one of the best parts of working with a variety of personalities.

Emoticons say so much without a word, conveying attitude, emotion, or frame of mind with a single picture. Around here at HipChat HQ, we’re guilty of having entire conversations in emoticons. If aliens discovered how the HipChat team messaged one another, they’d think we use modern-day hieroglyphics.

Our emoticon shrink ray resizes almost any image you find on the web into that 30 x 30 image you love. Because of this, we have so many inside jokes. It’s bananas how quickly whatever is trending in our social room becomes an emoticon, often within minutes, because people find it hilarious or poignant.

From Left Shark to one of our recruiter’s dogs looking like the cutest Ewok ever, we’re addicted to emoticons. It’s not just us: teams everywhere love them some Rage Comic faces, or the Sad Panda. Best part? They’re all predictive, so just type in the first letter and each one of these emoticons will pop up alphabetically.

Want to create your own? Team admins can add up to 100 custom emoticons. (Remember, with great power comes great responsibility.)

Check out some of our favorites below.

Funny

Sometimes it’s easier to drop an emoticon in when someone’s being hilarious. It works better than trying to put it into words. We rely so much on internet humor around here. (yey)

  • ALL THE THINGS (allthethings)
  • AWW YISS (awwyiss)
  • DISAPPEAR (disappear)
  • HUGE FAN (hugefan)
  • YO DAWG (yodawg)

Trolling

Need to bust some chops? These emoticons are great for calling someone out and letting them know you’re very much paying attention to what they’re doing and saying. ’Cause sometimes, when someone says something dumb, you need to drop a (troll) on them.

  • TROLL (troll)
  • DUMB (dumb)
  • DOWNVOTE (downvote)
  • LOL WUT (lolwut)
  • ORLY (orly)
  • UNACCEPTABLE (unacceptable)
  • YOU DON’T SAY (youdontsay)

Congrats/good job/nice work/thanks

Is someone crushing it at work? It’s always cool to give them a nod of approval, a wink that shows people notice their work. If your girl in marketing is dominating or your dude in sales is closing the deals, drop them an (awyeah). They’ll appreciate it.

  • NOT BAD (notbad)
  • YOU GOT IT DUDE (yougotitdude)
  • TRUE STORY (truestory)
  • ME GUSTA (megusta)
  • INDEED (indeed)
  • AW YEAH (awyeah)

Animated

Need a little something different? Try one of the animated gifs to add some flavor to your conversation.

  • FIREWORKS (fireworks)
  • BOOM (boom)
  • MIND BLOWN (mindblown)
  • ROCK ON (rockon)
  • WHY NOT BOTH (whynotboth)

Dislike

What happens when stuff gets weird? Or the conversation takes a turn south? An emoticon is a great way to lessen the tension, or let someone know that it might be time to switch lanes of thought. Best part? You never said a word.

  • DO NOT WANT (donotwant)
  • SAD PANDA (sadpanda)
  • SAD INDEED (sadindeed)
  • SAD COWBOY (sadcowboy)
  • SAD APPLE (sadapple)
  • OH GOD WHY (ohgodwhy)

Random

What about those times when you really have nothing to add to a conversation, but figure why not drop in a kudos or a simple, “sure, that’s cool.” These are perfect.

  • CLARENCE (clarence)
  • FACE PALM (facepalm)
  • NICE (nice)
  • NO IDEA (noidea)
  • SALUTE (salute)
  • SCUMBAG (scumbag)
  • WAT (wat)
  • WHOA (whoa)

Most Used

A few of these were mentioned above, but some of them are too good not to repeat. Here are just a few of the emoticons we overuse every single day.

  • LOL (lol)
  • DO NOT WANT (donotwant)
  • DISAPPEAR (disappear)
  • TRUE STORY (truestory)
  • WAITING (waiting)
  • NOT BAD (notbad)

Learn more about emoticons here.

Coming soon to every team: The emoticon shrink ray – create and upload custom emoticons with a simple click. Trust us, it’s awesome.

Work Faster with Slash Commands

We built HipChat for instant communication. HipChat slash commands are a set of time-saving tricks to get the most out of HipChat without reaching for your mouse.

All HipChat slash commands take the form: /Command + Text

  • Command is the action you want to take
  • Text is the new message you want to appear

Slash Commands for Room Actions

Slash Commands on HipChat

Slash Commands for Status Control

You can enter these commands in any chat window to change your availability.

HipChat Slash Commands

Slash Commands for Message Formatting

Keep your team members on their toes by rendering your messages in style. Or, fix mistakes you made in your last message by replacing words (ah, the relief!).

HipChat Slash Commands

Code Syntax Highlighting (/code):

Code formatting and syntax highlighting on HipChat

Quote Command (/quote):

HipChat Slash Commands

Fix Spelling Mistakes (s/):

HipChat Slash Commands

Emote Command (/me):

HipChat Slash Commands

Note: If you type an unknown slash command, it is sent as a normal message. If the command is recognized, it is executed and no message is sent.
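To make the dispatch rule concrete, here is a minimal sketch of how a client might classify what you type. It is not HipChat’s actual implementation: the command set is just the names mentioned in this post, and `parse_input`, `apply_replace`, and the choice to replace only the first match are assumptions made for illustration.

```python
import re

# Commands named in this post; the real client may support others.
KNOWN_COMMANDS = {
    "join", "topic", "part", "available", "away",
    "dnd", "clear", "code", "quote", "me",
}

def parse_input(line):
    """Classify chat input as a replacement, a command, or a plain message."""
    # s/old/new fixes a word in your previous message.
    sub = re.match(r"^s/([^/]+)/([^/]*)/?$", line)
    if sub:
        return ("replace", sub.group(1), sub.group(2))
    if line.startswith("/"):
        command, _, text = line[1:].partition(" ")
        if command in KNOWN_COMMANDS:
            return ("command", command, text)
    # Unknown slash commands fall through and are sent like any other message.
    return ("message", line)

def apply_replace(previous_message, old, new):
    """Assumption: only the first occurrence is replaced."""
    return previous_message.replace(old, new, 1)
```

With this sketch, `/away lunch` is treated as a command, while `/shrug hi` is simply sent as text.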

Android Users: The following do not work on the Android app: /join, /topic, /part, /available, /away, /dnd, /clear, and s/.

Slash commands help make work communication simpler and less stressful. With these tools, you can get to your chats faster, style your text, and fix spelling mistakes. After all, it’s your chat, and it should work the way you want it to.

June 15th Outage

What happened and what we’re doing about it

We sincerely apologize for our recent outage: you trust us with your chats, your important documents, your cat gifs, your personal conversations, your system notifications, your internet memes – and we let you down.

Our team takes pride in making HipChat an important part of your (work) life, and we’re sorry.

What happened?

Short version: the recent Mac client release, which included the much-anticipated “multiple account” feature, also had a subtle reconnection bug that only manifested under very high load (facepalm).

When a large network provider in the SF Bay Area had an issue Monday morning, it caused all of those clients to start reconnecting at once. This saturated our systems and prevented normal usage.

On Monday, we released an update to our backend systems, and Tuesday morning we released a new Mac app (v 3.3.1), both of which increased protection against this type of issue in the future.

We also fixed various other bugs related to reconnection in the Mac app that will prevent another connection overload like this one. And we continue to have teams building new, amazing technology that improves our system isolation, enhances our ability to do sophisticated load testing, supports even higher scale, and increases our server-side capability management (to disable misbehaving client functionality more directly, for example).

We never want you to be without HipChat. We fell far short of that Monday and are very sorry we let you and your teams down. 

We just passed 6B messages delivered via HipChat, more than 2B of which have been delivered in 2015 – our platform is scaling and growing faster than ever thanks to teams like yours.  We’re moving quickly to build a stronger HipChat as a result of the experience – thanks for your patience while we do.

 

HipChat and the little connection that could

Three weeks ago, we introduced HipChat’s brand new, badass web client. It’s fast, beautiful and built to change how people connect. Needless to say, we’re incredibly proud of it. But, as much as we wanted a perfect launch, we weren’t so lucky: if you tried to use the client in the first week or two, you might have noticed a few hiccups.

Sorry about that.

In the spirit of Open Company – No Bullshit, we wanted to keep our users informed about the recent outages and what we did to fix the issue. Some of those outages degraded other areas of HipChat, slowing our main website and message delivery. We’ve made moves to strengthen our web client’s stability so these issues never happen again.

How connecting to the HipChat web client works, at 10,000 feet

  1. You log into www.hipchat.com, creating a session with HipChat’s web layer.
  2. After logging in, you click Launch the web app which, in the web layer, creates a session with our BOSH Server.
  3. Once connected, our BOSH server in turn creates a session with our XMPP server.

In this chain, our BOSH Server is the weakest link. It wasn’t standing up to the popularity of the new client. And unfortunately, it’s coupled to our main web tier in a really bad way.

As our BOSH server came under pressure, it triggered a large number of sessions to reconnect. This, coupled with other issues, would cause hipchat.com to degrade. This is what happened the last week of March.

The little connection that could

With the new web client, the goal was to improve client reconnection, allowing HipChat to maintain resiliency toward network changes, roaming, outages, etc.

Previously, HipChat’s web client attempted reconnection every 10 – 30 seconds following a disconnection. This time around, we wanted a better experience: reconnecting as “automatically” as possible, hoping users never noticed a thing.

To do this, we decreased the connection retry interval from 10-30 seconds to 2 seconds. This drastically shortened interval, combined with a surge of new users, strained our system. When we rewrote the hipchat-js-client, we tried to ensure we had reasonable polling rates with exponential back-off and an eventual timeout.

Here’s what the new reconnect model looked like:

webclient reconnect

The initial reconnection attempts were too aggressive for the amount of traffic we saw. So, our first action was to quickly update the back-off rate and initial poll time to be more reasonable.

The problem with exponential back-off

As always, things get complicated when we consider this at scale (webscale). Let’s say a large number of clients become disconnected at once due to a BOSH node failure. With our current reconnection model, we saw the following traffic pattern:

backoff_expo_ts

(Above example from AWS Blog, not actually pulled from HipChat, but you get the idea.)

Well, that’s not that much more awesome.

We’ve effectively just bunched all the reconnection requests into a series of incredibly high-load windows where all of the clients compete with each other. What we really want is more randomness, so we implemented a heavily jittered back-off algorithm. This keeps the number of clients competing at any one moment as low as possible, while still encouraging clients to back off over time.

waitTime = min(MAX_WAIT, random_integer_between(MIN_WAIT, lastComputedWaitTime * BACKOFF_RATE))

backoff_fj_ts

(Again, this example from AWS Blog. They have prettier graphs.)

This model has had a huge impact, and made the service much more resilient.
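For readers who want to play with the formula above, here is a minimal Python sketch of it. The constants are placeholders, since this post does not give the values HipChat actually shipped, and `next_wait` and `schedule` are names invented for the example.

```python
import random

# Hypothetical values, in whole seconds; not HipChat's real settings.
MIN_WAIT = 2
MAX_WAIT = 300
BACKOFF_RATE = 3

def next_wait(last_wait, rng=random):
    """waitTime = min(MAX_WAIT, random_integer_between(MIN_WAIT, last * BACKOFF_RATE))."""
    upper = max(MIN_WAIT, last_wait * BACKOFF_RATE)
    return min(MAX_WAIT, rng.randint(MIN_WAIT, upper))

def schedule(attempts, seed=None):
    """Return the waits one client would use for successive reconnect attempts."""
    rng = random.Random(seed)
    wait, waits = MIN_WAIT, []
    for _ in range(attempts):
        wait = next_wait(wait, rng)
        waits.append(wait)
    return waits
```

Because each wait is drawn at random from a range that grows with the previous wait, two clients that drop at the same instant are unlikely to retry at the same instant, yet both still trend toward longer waits, capped at `MAX_WAIT`.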

Untangling the Gordian knot

As mentioned, our BOSH server and our web tier are unfortunately coupled. Currently, it’s the web tier’s job to attach a pre-authed BOSH session to new clients. We do a lot of nginx hackery to ensure that your web session and your BOSH session are live and are routed to the same box. This means that any time a web client reconnects, it hammers on its corresponding web box, making both unstable. This also makes scaling our BOSH server really tricky. And worse, it prevents service isolation, since we share a lot of resources between our website and HipChat’s web client.

As of March 26th, we’ve deployed changes that allow our web sessions and BOSH sessions to be uncoupled. In fact, all of our new web client users are already using this new auth method. This means we can scale our main website and our web client independently. We’ve already set up isolated worker pools for each. Together, these changes should ensure a misbehaving web client doesn’t cause a dead hipchat.com.

Double the trouble, double the fun

Since we knew session acquisition was our biggest pain point, we combed through our connection code, looking for ways to make it less expensive. We noticed that it was double-hitting Redis in some cases. A fix was quickly deployed, and the results?

double-query-redis

They speak for themselves.

How’s it looking?

Since we made these changes, distribution of load on our system has been much improved. In the graphs below, the white lines show the start of Friday 3/27.

last 4

Four days of traffic prior to change (Tue – Fri)


last 14

Preceding two weeks of traffic (Mon – Fri, Mon – Fri), notice/compare Fridays (end user platform use level is approximately the same).

Many thanks

We’ve got a long list of stability and performance fixes in the pipeline to keep up with amazing growth in demand for HipChat. Thanks for your patience and support. (heart) (hipchat).

(goodnews) New and improved emoticons

7 months ago Matt McDaniel 22 Comments

Here on the HipChat team, we take emoticons seriously. It’s really weird to spend a healthy portion of your work day discussing the finer points on tiny pictures from the internet. But when we see how happy those tiny pictures make everyone, it’s all worth it. Today, we want to tell you about a few ways we’ve improved emoticons in HipChat.

Emoticons for (allthescreens)

All of our global emoticons are now built to support hi-res displays, like Apple Retina displays and the higher end of the Android pixel density spectrum. This change resolved the much voted-on UserVoice ticket that we’ve had on our list for a while.

For every new image, we created hi-res sizes up to 4x (because you never know what the future will bring), so they’ll look good on both your fancy high-res displays and that old monitor your work won’t upgrade for you.  We’ve been rolling out these Retina emoticons over the last few weeks, and we’ll continue to tweak them every so often. Check out the big changelog below for details.

Sharing our favorites

Included in this latest release are a few of our favorite emoticons. Many, like (whoa) Keanu, were added because of your feedback. Others, like (disappear) and (salute), are ones we use so much around Atlassian that it didn’t feel right to keep them for ourselves.

  

Don’t forget that you can upload your very own custom emoticons just for your team. So next time you don’t have to wait for us to update the global set with (whoa) and can BYOK (Bring Your Own Keanu).

One size fits most

In addition to Retina-fying our global set, we improved our emoticon uploader. Now, admins can upload new, hi-res emoticons. And even better, we’ll automatically scale them for you. We suggest starting with an image around 120px (the new maximum). After the image goes through some emoticon uploader magic, you’ll end up with an effective 30px image after scaling.

We debated how to handle retina images. We chose this auto-scaling method because it’s far easier to upload new emoticons, or update an existing one, when you only have to create and maintain one size.

Protip: Keep in mind that whatever you upload is going to be scaled down 4 times, so a 1px stroke is going to be hard to see on a low-res monitor. We learned the hard way how much that can really affect line-drawing emoticons. That’s why you’ll see “beefed up stroke width” on a lot of the changelog below. It also helps if your image starts with dimensions that are multiples of 4 on both sides. For example: 116 px wide is great (116/4=29), but 115 px wide isn’t (115/4=28.75).
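If you are preparing a batch of images, a quick check like the following can catch awkward sizes before you upload. It is only a sketch of the arithmetic in the protip; `check_dimensions` is a made-up helper, not part of any HipChat tool.

```python
SCALE = 4            # uploads are scaled down 4x, per the protip
MAX_UPLOAD_PX = 120  # the new maximum mentioned in this post

def check_dimensions(width, height):
    """Return a list of human-readable problems; an empty list means the size is fine."""
    problems = []
    for name, px in (("width", width), ("height", height)):
        if px > MAX_UPLOAD_PX:
            problems.append(f"{name} {px}px exceeds the {MAX_UPLOAD_PX}px maximum")
        if px % SCALE:
            problems.append(f"{name} {px}px scales to {px / SCALE}px, not a whole pixel")
    return problems
```

For a 115 x 116 image, this reports that the width scales to 28.75px, while 116 x 116 passes cleanly.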

And now that you can’t stand to look at your old custom emoticons on your Retina monitor, you can refresh any of your existing custom emoticons by deleting the old ones and uploading new, larger versions. We’ll continue to refine and add to our global emoticon set.

The other “emoticons”

We also updated our set of little faces. Similar to the situation with the Retina emoticons above, we needed higher resolution images than the current assets allowed. We also wanted a new set of our own.

We started looking at emoji sets for inspiration and pulled in little touches from our HipChat brand, like using the logo smile shape as often as possible, and we ended up with a set that we really like. As with everything else, we’ll be iterating on them here and there, and someday may bring full emoji support.

Oh, and the icy blue corpse thumb thing we did is dead and buried. Dig it out if you want: (corpsethumb). We’re replacing the re-colored Apple emoji thumbs-up and thumbs-down with a new pair of icons that don’t look at all like Facebook. (badpokerface)

Check out hipchat.com/emoticons to see all the new emoticons and try them out with your team! And keep the feedback coming at help.hipchat.com.

Changelog

Changes made to global emoticons

  • (awyeah): beefed up stroke width
  • (badass): beefed up stroke width
  • (badjokeeel): added contrast and adjusted size
  • (bumble): added contrast
  • (caruso): adjusted size
  • (challengeaccepted): beefed up stroke width
  • (chewie): added contrast, brightened
  • (chucknorris): brightened up to bring out the Chuck Norris
  • (clarence): added contrast, brightened
  • (derp): beefed up stroke width
  • (dumb): enlarged to show face better
  • (facepalm): added contrast, tightened crop to improve chest to face and palm ratio
  • (fonzie): added contrast
  • (freddie): beefed up stroke width
  • (gangnamstyle): cleaned up gif
  • (gates): adjusted a little to match (jobs)
  • (goodnews): scaled up a little
  • (haveaseat): added contrast
  • (heart): updated to match smilies, adjusted crop
  • (ilied): lightened up to bring out details
  • (indeed): cropped to just face and highlighted monocle
  • (itsatrap): added contrast, brightened up to bring out detail
  • (jackie): beefed up stroke width
  • (jobs): adjusted to match (gates), added contrast and brightened
  • (joffrey): new image, fewer spoilers, more like the others (not shown above)
  • (kennypowers): added contrast and brightened
  • (krang): beefed up stroke width, reduced contrast between colors
  • (lincoln): brightened to blow out noise
  • (lolwut): added contrast and brightened, scaled up
  • (notbad): beefed up stroke width
  • (notsureif): replaced with (fry) to match
  • (philosoraptor): brightened to bring out detail
  • (present): added contrast and brightened
  • (reddit): removed white fill from antenna shape
  • (romney): added contrast and brightened
  • (sadpanda): using a sadder, more forward-facing panda
  • (samuel): added contrast and levels adjusted to stand out more on white
  • (skyrim): added contrast and brightened
  • (sweetjesus): crammed into (iseewhatyoudidthere)’s head shape to resolve weird corners
  • (taft): cropped closer to face, mustache volume increased, mustache/face contrast increased
  • (twss): added contrast and brightened
  • (wtf): enlarged to show face better
  • (yodawg): added contrast and brightened
  • (yuno): resized because he was too small
  • (zoidberg): resized because he was too small

New global emoticons

  • (awesome)
  • (aww)
  • (awwyiss)
  • (badtime)
  • (bicepleft)
  • (bicepright)
  • (borat)
  • (carl)
  • (catchemall)
  • (chef)
  • (cookie)
  • (corpsethumb)
  • (disappear)
  • (doh)
  • (donotwant)
  • (downvote)
  • (drool)
  • (evilburns)
  • (excellent)
  • (feelsbadman)
  • (feelsgoodman)
  • (finn)
  • (ftfy)
  • (giggity)
  • (goldstar)
  • (haha)
  • (huehue)
  • (hugefan)
  • (jake)
  • (meh)
  • (motherofgod)
  • (nice)
  • (noidea)
  • (notit)
  • (ohmy)
  • (paddlin)
  • (rockon)
  • (salute)
  • (sap)
  • (standup)
  • (taco)
  • (tayne)
  • (thatthing)
  • (theyregreat)
  • (toodamnhigh)
  • (unacceptable)
  • (upvote)
  • (waiting)
  • (whoa)
  • (yeah)
  • (youdontsay)

Elasticsearch at HipChat: 10x faster queries

Last fall we discussed our journey to 1 billion chat messages stored and how we used Elasticsearch to get there. By April we’d already surpassed 2 billion messages and our growth rate only continues to increase. Unfortunately all this growth has highlighted flaws in our initial Elasticsearch setup.

When we first migrated to Elasticsearch we were under time pressure from a dying CouchDB architecture and did not have the time to evaluate as many design options as we would have liked. In the end we chose a model that was easy to roll out but did not have great performance. In the graph below you can see that requests to load uncached history could take many seconds:

Average response times between 500ms-1000ms with spikes as high as 6000ms!

Identifying our problem

Obviously taking this long to fetch data is not acceptable, so we started investigating.

Hipchat Y U SO SLOW

What we found was a simple problem that had been compounded by the sheer data size we were now working with. With CouchDB we had stored our datetime field as a string and built views around it to do efficient range queries; something it did very well and with little memory usage.

So why did this cause such a performance problem for Elasticsearch?

Well, an old and incorrect design decision resulted in us storing datetime values in a way that was close to ISO 8601, but not entirely the same. This custom format posed no problem for CouchDB as it treated it as any other sortable string.

On the other hand, Elasticsearch keeps as much of your data in memory as possible, including the field you sort by. Since we were using these long datetime strings, it needed a lot of memory to store them: up to 18GB across our 16 nodes.

In addition, all of our in app history queries use a range filter so we can request history between two datetimes. For Elasticsearch to answer this query it had to load all the datetime fields from disk to memory for the query, compute the range, and then throw away the data it didn’t need.

As you can imagine, this resulted in high disk usage and CPU I/O wait.

But as we mentioned earlier, Elasticsearch stores this datetime field in memory, so why can’t it use that data (known as field data) instead of going to disk? It turns out that it can, but only if you are using a numeric range for your index, and we were using these custom datetime strings.
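To illustrate the difference between a sortable string and a true numeric value, here is a small Python sketch. The timestamp format and the field name are hypothetical, since this post does not show our real custom format, and the query body uses the generic Elasticsearch `range` clause rather than the exact filter we ran in production.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical "almost ISO 8601" format, e.g. "2014-05-01T12:00:00Z 123456".
CUSTOM_FORMAT = "%Y-%m-%dT%H:%M:%SZ %f"
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def to_epoch_millis(stamp):
    """Parse the custom string into a single integer (milliseconds since 1970)."""
    parsed = datetime.strptime(stamp, CUSTOM_FORMAT).replace(tzinfo=timezone.utc)
    # Integer division of timedeltas avoids floating-point rounding.
    return (parsed - EPOCH) // timedelta(milliseconds=1)

def history_range(field, start, end):
    """Build a range clause over numeric timestamps for [start, end)."""
    return {
        "range": {
            field: {"gte": to_epoch_millis(start), "lt": to_epoch_millis(end)}
        }
    }
```

The string form in this example is 27 characters per value, while the numeric form fits in one 64-bit integer, which gives a rough feel for why the in-memory footprint shrinks.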

Kick off the reindexing!

Once we identified this problem, we tweaked our index mapping to store our datetime field as a datetime type (with our custom format) so all new data would be stored correctly. We leveraged Elasticsearch’s multi-field support, which meant we were able to keep our old string datetimes around for backwards compatibility. But what about the old data? Since Elasticsearch does not support applying this kind of mapping change to an existing index, we’d need to reindex all of our old data into a new set of indices and create aliases for them. And since our cluster was under so much IO load during normal usage, we needed to do this reindexing on nights and weekends when resources were available. There were around 100 indices to rebuild, and the larger ones took 12+ hours.

Elasticsearch helped this process by providing helper methods in their client library to assist in our reindexing. We also built a custom script around their Python client to automate the process and ensure we caused no downtime or lost data. We hope to share this script in the future.
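Until we can share the real script, here is a rough sketch of the alias-swap step such a process relies on. It only builds the request body for Elasticsearch’s aliases endpoint and plans the work; copying the documents themselves would be left to a client helper (the Python client ships one as `helpers.reindex`), which is not called here. The index names and the `_v2` suffix are invented for the example.

```python
def alias_swap_actions(alias, old_index, new_index):
    """Body for the aliases endpoint: move `alias` from old to new in one request."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }

def plan_reindex(indices, suffix="_v2"):
    """List (source, target, alias body) for each index, in a stable order."""
    plan = []
    for name in sorted(indices):
        target = name + suffix
        # Assumption: readers query through an alias named after the old index.
        plan.append((name, target, alias_swap_actions(name + "_alias", name, target)))
    return plan
```

Swapping the alias in a single request means readers never see a moment where the alias points at nothing, which is one way to avoid downtime during a rebuild.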

The fruits of our labor

Once we finished reindexing we switched our query to use numeric_ranges and the results were well worth the work:

Going from 1-5s to sub-200ms queries (and data transfer)

So the big takeaway from this experience for us was that while Elasticsearch dynamic mapping is great for getting you started quickly, it can handcuff you as you scale. All of our new projects with Elasticsearch use explicit mapping templates so we know our data structure and can write queries that take advantage of it. We expect to see far more consistent and predictable performance as we race towards 10 billion messages stored.

We’d love to be able to make another order-of-magnitude performance improvement to our Elasticsearch setup and ditch our intermediate Redis cache entirely. Sound fun to you too? We’re hiring! https://www.hipchat.com/jobs

HipChat + Elasticsearch guest list expanded

You want it – You got it

75 more slots to attend SF Elasticsearch’s Meetup on November 18th

Capacity reached! But we will be recording the talk and sharing it with the Elasticsearch community.

At HipChat, we’re big fans of Elasticsearch. It’s helped us scale our infrastructure. The title of this talk will be “Heavy Lifting: How HipChat Scaled to 1 Billion Messages.”

Originally, we thought 75 people might register for the talk. But, the Bay Area Elasticsearch community is bigger and more passionate than we anticipated. So we’re doubling our guest count to 150 people. RSVP now to save your spot!

Note: We also changed the time of the Meetup. The main talk will begin at 6:30pm. Not 7pm.

What you’ll learn

One of the keys to our success has been building a scalable backend. Elasticsearch has played a big part in this.

We plan to talk about how we scaled to sending over 1 billion messages and how Elasticsearch allows us to index all of them and make near-realtime search possible. We will also discuss our future with Elasticsearch: using it for more than just search and logs. We’ll share some tips and things we learned (and are still learning) about our transition to Elasticsearch.

Why you should attend

  • Free pizza, beer and sodas
  • A chance to talk with our engineering team, including HipChat Founders
  • You’ll get some of HipChat’s popular meme stickers
  • Chance to win limited-edition HipChat t-shirts
  • Learn something cool

Did we mention we’re hiring?

Full disclosure: we know the Elasticsearch community is packed with incredible engineering talent. We’d love to talk with you about current and future opportunities to build the best damn group chat application for teams. We’ll have one of our talent coordinators on-site in case you have questions about the company, our values or the hiring process.

Can’t make it? You can always submit a resume to jobs@hipchat.com. We (heart) smart people.

How HipChat scales to 1 Billion Messages

When Atlassian acquired HipChat, we had sent about 110 million messages. Today that number has grown tenfold, and it’s still growing at a record pace. Scaling to meet these demands has not been easy but the HipChat Ops team is up to the task. We thought it’d be cool to shine some light on what it took, infrastructure wise, for those who are curious about this kind of stuff. In this post, we’ll highlight how we use CouchDB, ElasticSearch, and Redis to handle our load and make sure we provide as reliable a service for our users as possible.

Road to 1 billion messages

Getting off the Couch to scale chat history and search

Originally HipChat had a single m2.4xlarge EC2 Instance running CouchDB as datastore for chat history and Couch-lucene for search, a fine set up for a small application. However, once we started to grow, we began to hit the limits of CouchDB and AWS instance size, and we’d be out of memory daily. We kicked off a project to look at other data stores and indexers to solve this problem, and we concluded that the first step involved upgrading our search indexer. So we kicked Lucene to the curb in favor of Elasticsearch.

Heeding the advice of the Loggly team, we set up 7 Elasticsearch index servers and 3 dedicated master nodes to help prevent split brain. Elasticsearch lets us add more nodes to our cluster when we need more capacity, so we can handle extra load while concurrently serving requests. Moreover, the ability to have our shards replicated across the cluster means if we ever lose an instance, we can still continue serving requests, reducing the amount of time HipChat Search is offline.

For chat history, we still use CouchDB as our datastore, but we are beginning to hit limits with AWS trying to fit everything into a single instance. Just prior to hitting a billion messages, we noticed that during compaction, the EBS volume storing our CouchDB files was running out of disk space. AWS limits EBS volumes to 1TB, so as a stopgap solution, we decided to try out EBS RAID. We at HipChat don’t believe in one-off solutions, so we used a slightly hacked version of the Opscode AWS Chef cookbook to automate the process of creating, mounting, and formatting our RAID arrays. Our hack can even rebuild the RAID using EBS Snapshots. True webscale stuff.

Currently, we pull data from CouchDB using a custom Ruby import script, but since Elasticsearch has treated us so well, we are looking to replace CouchDB with just Elasticsearch. If you want to hear more about this, we plan on giving a talk about Elasticsearch at a meetup here at Atlassian.

Caching in on Redis

We at HipChat use Redis a lot, caching everything from XMPP session info to up to 2 weeks of chat history. Originally we started with two Redis servers, one caching stats and the other caching everything else, but we soon realized that we’d need more help. Today, we shard our data over 3 Redis servers, with each server having its own slave. We continue to dedicate one of these servers to hosting our stats, while leaving the other two to cache everything else.
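As a rough illustration of this layout, the sketch below routes keys to one dedicated stats server or to one of two general cache servers. The host names, the `stats:` key prefix, and the CRC32-based routing are all assumptions made for the example; this post does not describe how keys are actually assigned to servers.

```python
import zlib

# Hypothetical host names for the three Redis servers.
STATS_SERVER = "redis-stats"
CACHE_SERVERS = ["redis-cache-1", "redis-cache-2"]

def server_for(key):
    """Pick the Redis server responsible for a key."""
    if key.startswith("stats:"):
        # One server is dedicated to stats.
        return STATS_SERVER
    # CRC32 is stable across processes, unlike Python's built-in hash() for str.
    index = zlib.crc32(key.encode("utf-8")) % len(CACHE_SERVERS)
    return CACHE_SERVERS[index]
```

A simple modulo scheme like this is easy to reason about, but adding a server remaps most keys, which is one reason scaling this layer takes care.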

However, even with these changes, we found that we had to upgrade our Redis history instance size, as we were running out of memory close to our billion-message milestone. We will continue to improve the scalability in this area of the HipChat architecture, so we can handle load and wean ourselves off our dependence on Redis clustering to mitigate single points of failure.

Future

This is just a highlight of some parts of the HipChat infrastructure we needed to tweak to help us reach 1 billion messages. We still have a long way to go to scale HipChat for our growing enterprise needs, such as improving our Redis architecture. Building a more robust system, increasing the performance of our code, and mitigating or removing single points of failure are large objectives that our Ops team looks forward to tackling in the coming months.

If you want to learn more or think you can help us scale HipChat better, I suggest you come by our meetup. If you can’t make it to that, feel free to submit your resume here. Our team is growing fast, and we would love to have you on our team.

HipChat search now powered by Elasticsearch

We recently announced new search improvements in HipChat that support advanced searching of chat histories. At the same time, to support the scale at which HipChat is growing we needed to rethink our search architecture. The result? We switched to Elasticsearch.

Previous Setup

Originally, HipChat search was powered by a single AWS instance running couchdb-lucene. This was acceptable in the early days. But we had the issue of a single point of failure for our search system.

As HipChat grew, we needed a bigger and bigger AWS instance – to the point that we were using the 2nd largest memory instance AWS had. Even then we experienced periods of search outages that prevented our users from searching, all because we had a single instance with no redundancy.

Say Hello to Elasticsearch

We determined our previous setup was not sustainable so we kicked off a project to find a new search engine. After kicking the tires on a few solutions we landed on Elasticsearch.

Why Elasticsearch?

  • It is built on top of Apache Lucene, so it is familiar to us
  • It supports distributed nodes, allowing us to run multiple nodes in different AWS availability zones
  • It supports robust plugins, including one allowing AWS node discovery
  • We could roll it out with as little impact to users as possible

You can read more about all the great Elasticsearch features here.

Deploying Elasticsearch at HipChat

Search error percentage before and after Elasticsearch

Rolling out Elasticsearch at HipChat with as little impact to users as possible was a key goal of ours. During the evaluation phase we wrote a script that duplicated our search queries in production and ran them against the search engines we were testing, logging any differences in the responses and logging response times to statsd/Graphite. This allowed us to figure out which service could handle the load we generated.
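
The comparison step of such a script might look roughly like the sketch below. It is not the script we ran: the function name, the result shape (a list of message ids), and the returned report are assumptions for illustration. The two search engines are passed in as plain callables so the example stays self-contained.

```python
import time
from typing import Callable, Dict, List

# A search function takes a query string and returns matching message ids.
SearchFn = Callable[[str], List[str]]


def shadow_compare(query: str, current: SearchFn, candidate: SearchFn) -> Dict[str, object]:
    """Run one query against both engines and report timings and differences."""
    start = time.perf_counter()
    current_hits = current(query)
    current_ms = (time.perf_counter() - start) * 1000.0

    start = time.perf_counter()
    candidate_hits = candidate(query)
    candidate_ms = (time.perf_counter() - start) * 1000.0

    return {
        "query": query,
        "current_ms": current_ms,
        "candidate_ms": candidate_ms,
        # Hits returned by one engine but not the other.
        "only_in_current": sorted(set(current_hits) - set(candidate_hits)),
        "only_in_candidate": sorted(set(candidate_hits) - set(current_hits)),
    }
```

In a real deployment the two timing values would be sent to a metrics system rather than returned, and the differences written to a log for later review.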

Working with the Elasticsearch consulting team we determined that the standard couchdb-river would not work for us, so we built a custom Ruby importer to support the type of performance we needed. We hope someday to open source this script, but currently it’s very tailored to our needs.

At HipChat, we leverage feature toggling for many of our features, so when we roll out something new we can enable it for only certain groups. This allows us to test at scale without causing disruption to all of our customers. We used this feature to roll out Elasticsearch slowly to a small subset of customers (thanks to all the customers who helped us beta test Elasticsearch!). Once we got comfortable with Elasticsearch and saw it was beating out couchdb-lucene, we decided to roll it out to all of our customers with minimal impact on end users.
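
A per-group feature toggle can be as simple as a lookup table. The sketch below is a minimal illustration, assuming a hypothetical feature name and made-up group ids; it does not reflect how HipChat's toggles are actually stored or evaluated.

```python
from typing import Dict, Set

# Hypothetical toggle table: feature name -> group ids with the feature on.
ENABLED_GROUPS: Dict[str, Set[int]] = {
    "elasticsearch_search": {101, 202},
}


def is_enabled(feature: str, group_id: int) -> bool:
    """Return True if `feature` is switched on for this group."""
    return group_id in ENABLED_GROUPS.get(feature, set())


def search_backend(group_id: int) -> str:
    """Choose the search backend for a group based on the toggle."""
    if is_enabled("elasticsearch_search", group_id):
        return "elasticsearch"
    return "couchdb-lucene"
```

Rolling out to everyone then becomes a data change (adding groups to the table) rather than a code deploy.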

The Ops team at HipChat is still working on other ways to scale HipChat to make it as reliable as possible!

Happy Searching all!

Make HipChat your Team’s Command Center

2 years ago Jeff Park 4 Comments

Our customers love HipChat because it’s so easy to extend. HipChat connects to over 45 tools that your company uses every day. Here are 5 ways to make HipChat your team’s command center and stay on top of everything your team needs to know about.

1. Connect to JIRA and Pivotal Tracker

Every company has projects they need to manage. Keeping up with the issues your team needs to address for these projects is a breeze with HipChat. Integrate with project management tools like Atlassian JIRA or Pivotal Tracker, and receive updates whenever an issue is opened, commented on, or resolved.

2. Collaborate on code with Bitbucket or GitHub

With software eating the world, your team most likely has some code to work with. Tie in your repositories from GitHub and Bitbucket to receive a notification whenever a teammate pushes code, creates a branch, opens a pull request and more.

3. Build and deploy with Bamboo or Jenkins

Deploying clean code is critical to your team’s success. Integrate HipChat with a continuous integration tool like Jenkins or Atlassian Bamboo and be the first to know whenever your code passes or fails a build. If your team deploys with Heroku, you can have HipChat send you a message to let you know a team member deployed your app.

4. Tackle customer service with UserVoice and Zendesk

No matter what your company does, the customer is critical to your success. Stay connected and provide immediate kick-ass service to your customers by bringing UserVoice and Zendesk into HipChat.



5. Missing something? Zapier has you covered

Using Zapier, you can integrate HipChat with any other tool your team uses. Zapier supports 200+ services and has a simple interface, so your team doesn’t have to spend any time writing code to get these integrations set up. Start getting notifications in HipChat in just a couple steps. Check out the list of services and instructions to set up your Zaps, and when you’re ready, sign up through this link to get an extra 100 tasks per month!

There’s no need to fumble around to stay up to date on what’s going on. Keep a pulse on your team’s activity by integrating the tools you use with HipChat. With one service sending notifications, your team spends less time distracted and more time shipping awesome products.