Loggly


Big Data Uncovered?

Posted 7 Feb, 2012 by Kord Campbell in Business and Startup

I recently came across an eWeek article titled 2012: A Cloudy Year for Big Data by Frank Ohlhorst. You could easily say I have a few opinions on the matter of big data!  :)

First, I agree with Frank’s first notion that big data is neither big nor new.  The fact is, I've been saying things like "Dude, that's a ton of data!" since I started notching out the opposite sides of floppies back in the 80s.  Remember these?

Ohlhorst quickly follows up his vague hand-waving about ‘big data’ being a new term with, “For most of its existence, big data has been out of the reach of small and midsize businesses (SMBs) because the storage and processing power needed to make this technology work is too expensive.” 

For years, companies have done what they need to do to run a better business, regardless of whether it’s expensive. In manufacturing, it can actually cost a small company more to optimize how it makes tons of a cheap product than it costs a larger company making a few units of a complex product.  In the same vein, smaller businesses may have more complex business optimization processes than larger ones, and may need relatively larger amounts of data to solve those problems than larger companies do.

I agree with Frank that small businesses typically don't have the resources necessary to solve massive scale problems, but again the problems are relative.  For example, small software startups don't have project managers where larger ones do, not because they can’t afford them, but because they really don’t need them in a full-time capacity.  I think this may be part of why SaaS has been a huge hit, and why the term cloud has taken off.  SaaS allows companies to tackle a wide variety of problems across the entire business, all the while providing cost effective high tech solutions to solve problems in a way you could never have done before.  For the first time in history the quality of business processes is experiencing sustained growth.

“These new cloud-based capabilities are on a growth path and are creating more opportunities for even the smallest of businesses to leverage big data without the traditional expenses of compute farms and massive storage arrays.” 

Yes.  However, compute farms and storage aren't the main thing that these companies need. They need access to the raw data that describes their business, and the tools to extract the data on which they can take action. Figuring out your company’s problems requires brain power, understanding, data and tools. CPU and disks don't solve complex problems. People do.

It's All About Application Analytics

Ohlhorst also describes big data analytics as being comprised of three primary elements: volumes of unstructured data, processing power and algorithms. However, big data doesn't always imply unstructured data. Log files, or what Loggly calls Big Time Data™, typically contain a large amount of structure.  Dealing with structured data isn't always easy, and if you write software that 'expects' a certain structured format, your analysis can be broken or flawed if it encounters data that doesn't fit the structure you coded for.  One way around this problem is to apply extra metadata to the data set.  One technique for doing that is adding a search index to the data, which is the approach Google pioneered and what Splunk and Loggly do for log files or time series event data.  By being able to run text searches against the data, and interact with it in real time, or near real time, the user can focus on solving the problem.
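
To make the search index idea concrete, here is a minimal sketch of an inverted index over log lines. It is a toy for illustration only, not how Loggly, Splunk or Google actually build their indexes; the function names and sample log lines are made up.

```python
from collections import defaultdict


def build_index(lines):
    """Map each lowercase token to the set of line numbers containing it."""
    index = defaultdict(set)
    for lineno, line in enumerate(lines):
        for token in line.lower().split():
            index[token].add(lineno)
    return index


def search(index, query):
    """Return sorted line numbers that contain every term in the query."""
    terms = query.lower().split()
    if not terms:
        return []
    hits = set.intersection(*(index.get(t, set()) for t in terms))
    return sorted(hits)


# Made-up log lines with mixed structure; the index does not care.
logs = [
    "GET /index.html 200",
    "GET /missing 404",
    "POST /login 500 exception in handler",
]
index = build_index(logs)
```

Calling `search(index, "exception 500")` looks up each term and intersects the results, so lines that don't fit any expected format are still findable by their text.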

Ohlhorst continues, “For it to be true big data, there has to be lots of it, and most SMBs don’t generate that volume of data internally, which leads them to seek out alternative data sources. Here, the cloud delivers.” Not true.  Big data should be defined as an amount of data that a human cannot reasonably digest. Generating large amounts of meaningful data is actually a bigger problem. Again, it's about understanding the problem you have before you can solve it.

Yup. Ohlhorst explains that throughout 2012, data sets and others can be expected to grow exponentially: “The amount of data being generated globally increases by 40 percent a year,” according to the McKinsey Global Institute, a data analytics research firm.  True.  The access of this data, mostly through the web, generates vast amounts of data as well. Ohlhorst continues that information needs to be organized, sorted and processed, and that takes computing power.  Frankly, nowadays CPU is cheap enough that most of these problems can be solved on your laptop. Fast CPUs for crunching 'big data' aren't the problem any more than a search engine's main problem is crawling for data. The real bottleneck is adding meaning to the data that a customer can digest and make actionable.

PaaS/IaaS Accessibility Is a Problem 

I’m glad that Ohlhorst recognized that Amazon isn’t the only one in the game in offering private cloud-based big data analytics platforms. He believes that since this technology is designed as a complete platform and not as a service, these platforms are still out of the reach of the SMB market.

Ohlhorst is right that these platforms are out of reach - but not just because they are designed as a complete platform and not as a service. I think it's because SMBs don't know they need it, don't have the data to put on it, and don't have the resources to manage it.  There are plenty of hosted solutions out there (see SalesForce and their app marketplace) that provide some serious horsepower to the most important task - managing a company's contacts.

And of course, there had to be a Splunk mention in his article.  Splunk sells expensive enterprise software.  Their software is often times the most expensive piece of software a company has ever bought. Sounds like Oracle, eh?  They aren't converting big data analytics into cloud services; they are simply taking their product and making a slimmed down version into a cloud offering they can generate leads with.  Any serious big data customer they land will have to buy that very expensive solution and install it on a bank of computers and then pay people to manage it.  SaaS is not what Splunk is taking to the market when they go IPO.  It's their hellaciously expensive software licenses.

Big Time Data™.  It's in the future of your small business.


Loggly's Outage for December 19th

Posted 19 Dec, 2011 by Kord Campbell in Business and Startup

Sometimes there's just no other way to say  "we're down" than just admitting you screwed up and are down.  We're coming back up now, and in theory by the time this is read, we'll be serving the app again normally.  There will be a good amount of time until we can rebuild the indexes for historic data of our paid customers. This is our largest outage to date, and I'm not at all proud of it.

So What Happened?

Sometime yesterday afternoon ALL of our machines on Amazon's East region, availability zone 1d, were rebooted by AWS staff.  Originally we stated we had not received reboot notices from Amazon, but the truth is that four of the staff here, myself included, received two separate vague notices, one from about 10 days ago, and another from 3 days ago, which stated 'some or all' of our instances were scheduled to be rebooted.  These notices were found in our spam folders on Gmail, placed there with a very large red notice reading: "Warning: This message may not be from whom it claims to be. Beware of following any links in it or of providing the sender with any personal information."  Meh.

Loggly uses a variety of monitoring mechanisms to ensure our services are healthy.  These include, but are not limited to, extensive monitoring with Nagios, external monitors like Zerigo, and using a slew of our own API calls for monitoring for errors in our logs.  When the mass reboot occurred we failed to alert because a) our monitoring server was rebooted and failed to complete the boot cycle, b) the external monitors were only set to test for pings and established connections to syslog and http (more about that in a moment), and c) the custom API calls using us were no longer running because we were down.

Combined, these failures effectively prevented us from noticing we were down.  This in and of itself was the cause of at least half our downtime, and to me, the most unacceptable part of this whole situation.

The Human Element

The other cause of our failure is what some of you on Twitter are calling "a failure to architect for the cloud".  I would refine that a bit to say "a failure to architect for a bunch of guys randomly rebooting 100% of your boxes".  A reboot of all boxes has never been tested at Loggly before.  It's a test we've failed completely as of today.  We've been told by Amazon they actually had to work hard at rebooting a few of our instances, and one scrappy little box actually survived their reboot wrath.

While some might go on a rant about how 'normal' failures don't affect 100% of your boxes, the truth is that anything and everything (including an army of reboot monkeys) can be expected to happen to your servers if you wait around long enough.  The trick to being good at running a reliable service is to architect around any number of everythings that could happen to your service and build for it.

In this case we didn't ever build the workaround simply because the system we run - a combination of 0MQ+Solr+Zookeeper+Loggly Special Sauce - makes it extremely challenging to survive a complete failure with more than 1/2 of the cluster missing.  With other challenges facing us, we decided to live with the risk. Now we're dealing with the fallout of our decision.

So, How Do We Make This Right?

Single instances of Loggly's search cluster can't be spread across multiple availability zones or regions due to the amount of data we push around, latencies between the search nodes, and the lack of support in our system for redundant indexes.  We've been OK with those limitations in the past simply because we normally archive data to S3 when we catch it, and we are capable of rebuilding indexes on the fly if we lose one or more indexers.

The first step in addressing this is to start sharding our customers across multiple Loggly deployments.  This will prevent further outages to the entire customer base.  The second step is to start doing Loggly deployments on dedicated hardware.  Because we keep large amounts of data on our boxes (the indexes) this is pretty much a requirement for fast recovery times when a deployment goes down.  While S3 is AWESOME for backups, it sucks big-time for rebuilding a large amount of search index data.
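
As a rough illustration of what sharding customers across deployments could look like, here is a minimal sketch that assigns each customer to a deployment by hashing its subdomain. The post does not describe the actual scheme; the deployment names, the hash choice and the function names are all assumptions.

```python
import hashlib

# Hypothetical deployment names; the post does not say how many there will be.
DEPLOYMENTS = ["deploy-a", "deploy-b", "deploy-c"]


def deployment_for(customer, deployments=DEPLOYMENTS):
    """Pick a deployment from a stable hash of the customer's subdomain."""
    digest = hashlib.sha1(customer.encode("utf-8")).hexdigest()
    return deployments[int(digest, 16) % len(deployments)]


def affected_customers(customers, failed, deployments=DEPLOYMENTS):
    """List the customers that live on a failed deployment."""
    return [c for c in customers if deployment_for(c, deployments) == failed]
```

With a layout like this, losing one deployment only takes down the customers returned by `affected_customers` for it. Plain modulo hashing reshuffles most customers whenever a deployment is added, so a real system would more likely keep an explicit customer-to-deployment table.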

The third step is to ensure more robust external monitoring.  With multiple deployments, this becomes less of an issue, but clearly we need more reliable checks than what we rely on with Zerigo or other services.  Sorry, but simple HTTP checks, pings and established connections to a box do not guarantee it's up!
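
One way to go beyond pings and open ports is an end-to-end check: push a unique event in through the front door and confirm it can be found by search. The sketch below shows the shape of such a check. `send_event` and `search` are hypothetical callables supplied by the caller, not real Loggly or Zerigo APIs, and the timeout values are arbitrary.

```python
import time
import uuid


def end_to_end_check(send_event, search, timeout=60.0, interval=5.0,
                     sleep=time.sleep):
    """Send a unique marker event, then poll search until it shows up.

    send_event(text) and search(text) -> hit count are hypothetical
    callables standing in for a real ingest call and a real search call.
    Returns True only if the marker becomes searchable within the timeout.
    """
    marker = "healthcheck-" + uuid.uuid4().hex
    send_event(marker)
    waited = 0.0
    while waited <= timeout:
        if search(marker) > 0:
            return True
        sleep(interval)
        waited += interval
    return False
```

A check like this fails when any stage between ingest and search is broken, even if every box still answers pings. It has to run from somewhere that does not share fate with the service being checked.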

Finally, we accept full responsibility for the impact to our customers.  We will be in touch with our paid customers sometime over the next week to address compensation for this outage.

We welcome feedback below, and encourage useful criticism of our architectural choices.  All I would ask is that you consider Loggly's infrastructure isn't the same as yours, and I've greatly simplified the reasons for not being more redundant in our deployments.  We can, and will, endeavor to do better in the future.

Kord Campbell, CEO


Automated Integration

Posted 16 Nov, 2011 by Mike Blume in Business, Code, and Startup

One of the fundamental challenges of distributed coding is deciding what/when to integrate. Sure, that patch your colleague just sent you looks good, but is it actually ready to go into master? At Loggly, we've been feeling our way towards a disciplined integration process. A year ago, our frontend developers were all making commits directly to trunk in a single SVN repo. Once every few weeks, we'd run `svn up` on our servers, and hope for the best. Today our code goes through peer review, unit testing, and static analysis before it even touches our master branch.

Like most projects these days, the process starts on github. Fork. Push a feature branch to your repo. Open a pull request. Go through a couple rounds of discussion and revision. Merge. Every change to our code goes through this process. At first we thought it would slow us down, that we'd want pull requests for the nontrivial code and to just push to master for the easy stuff. After just a few days, we found the pull requests were slowing us down not at all, and that we all enjoyed the greater transparency into our colleagues' work.

Once we merge, the automation kicks in -- our default integration branch is 'proposed', so clicking merge doesn't actually get the code into the master branch. Jenkins polls our 'proposed' branch once a minute, then runs a simple preflight script on the code.

Rather than keep that preflight in a Jenkins configuration page, we have it checked into the codebase so that any developer can run it too; this way there's no excuse for breaking the build -- you should have seen it break locally =P

Here's our preflight script. Let's go through it line by line.

DIR="$( cd "$( dirname "$0" )" && pwd )"

APP=$DIR/..

First we figure out where we're running, so that we can find the other scripts distributed with the app. 

$DIR/purge_pyc

Next we purge pyc files. This is done because if a user recently switched from a branch which contained files which don't exist in our branch, the pyc files may still be around, and may be found by the interpreter. 
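
The purge_pyc script itself isn't shown in this post. As a rough idea of what such a step does, here is a minimal Python sketch that walks a directory tree and deletes stale bytecode; the function name and return value are assumptions for illustration.

```python
import os


def purge_pyc(root):
    """Delete every .pyc file under root and return the paths removed."""
    removed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".pyc"):
                path = os.path.join(dirpath, name)
                os.remove(path)
                removed.append(path)
    return sorted(removed)
```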

$DIR/syncenv

Next we run a script to sync our python virtual environment, and ensure all requirements are present. 

$DIR/runtests

Here, of course, we run our unit tests. Each run prints a coverage report, so that as we recover from our testing debt, we can measure our progress.

&& $DIR/lint...

Next, and this is important, we run pylint over the parts of our app that we expect to pass with no warnings. As we clean up our app, we continue to add modules to this list. Pylint does a few useful things for us. It looks for trivial name errors of the kind that could quickly cause code to stacktrace -- using a module without importing it, etc. It also enforces certain kinds of coding discipline. Our functions and modules can't exceed a certain length. The cyclomatic complexity of our functions is limited. 

If all of this passes successfully, Jenkins automatically pushes the checked-out commit to master, which is where we base our development. Thus, we're always basing our development on known-vetted code.

If any of it fails, Jenkins still has a couple more tricks to pull. Here's our on-failure script:

This comes in two parts. The first runs a standard-issue git-bisect between origin/proposed and origin/master. Since origin/master has already been vetted by Jenkins (that's how it became master), we know there'll be a regression somewhere between the commits. This goes into the session output, and is e-mailed to the relevant committers. Next, we roll the proposed branch back to the already-vetted master branch. Whatever pull request broke the build will have to be re-made from scratch.
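
The on-failure script is not reproduced here, so the following is only a sketch of the two-part flow as described, written as a Python wrapper around git. The preflight path, the remote name and the exact push refspec are assumptions, not the actual Loggly script.

```python
import subprocess


def on_failure_commands(preflight="./preflight", remote="origin"):
    """Return the git commands for the two-part on-failure flow.

    The preflight path and remote name are assumptions for illustration.
    """
    bad = remote + "/proposed"
    good = remote + "/master"
    return [
        # Part one: bisect between the failing tip and the vetted commit,
        # using the preflight script to judge each candidate commit.
        ["git", "bisect", "start", bad, good],
        ["git", "bisect", "run", preflight],
        ["git", "bisect", "reset"],
        # Part two: force the proposed branch back to the vetted master.
        ["git", "push", "--force", remote, good + ":refs/heads/proposed"],
    ]


def run_on_failure(run=subprocess.call):
    """Run each command in order and return the list of exit codes."""
    return [run(cmd) for cmd in on_failure_commands()]
```

Keeping the command list separate from the runner makes it easy to print the commands for a dry run before letting a CI job force-push anything.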

 


PagerDuty, Loggly, and Alert Birds

Posted 28 Oct, 2011 by David Lanstein in Business and Code

Well, I'm back, and this time I'm here to talk about an awesome product that we use all the time, PagerDuty.  We use it internally for our own alerting (as do a number of Fortune 500 companies along with a million other startups), but we've also integrated it into Alert Birds, which is our alerting tool.  With Alert Birds, you can configure saved searches that run against Loggly, and you'll run those searches over a period of time that you've selected, and Alert Birds will escalate alerts in PagerDuty.  Before you can do any of those things, however, you need to set up the PagerDuty endpoint:

 

After you've done that, the next thing you'll need to do is to configure a saved search, and then configure the alert that you want to run.  The search itself is pretty straightforward, it has a name, a search string e.g.

(this is why it's cool to send us JSON!), and a list of inputs and devices that you choose - you may want to run a particular search on only your web servers, for instance.  The interesting bit is the alert itself, which runs a search that you choose, but has a number of options as to what conditions constitute an alert, and what the message should be:

 

This is where PagerDuty comes in.  Although you can send a GET or POST request to an endpoint of your choosing with the alert data, triggering an alert in PagerDuty is far more useful, as they can SMS/email/phone you, and they also handle escalations and reporting.  So, in the example above, if my web servers are spewing 500 exceptions, I want my ops folks to get notified, provided there are more than 10 - I don't want to wake anyone up over a little blip!  I'm a nice IT manager like that.  Anyhow, once an alert is in a critical state, it will run your search every minute until you're below the threshold, and once that happens, Alert Birds will automatically resolve your alert in PagerDuty.
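
The trigger-then-resolve behavior described above can be pictured as a small state machine. This is a sketch of the logic only, not Alert Birds' implementation; the class name is made up, and treating a count equal to the threshold as "resolved" is an assumption.

```python
class ThresholdAlert:
    """Track one alert: trigger above a threshold, resolve once back under it."""

    def __init__(self, threshold=10):
        self.threshold = threshold
        self.critical = False

    def observe(self, hit_count):
        """Feed one search result; return 'trigger', 'resolve', or None."""
        if not self.critical and hit_count > self.threshold:
            self.critical = True
            return "trigger"
        if self.critical and hit_count <= self.threshold:
            self.critical = False
            return "resolve"
        return None
```

Each call to `observe` stands for one run of the saved search. Only the transitions produce an action, so ops gets one page when the count first exceeds 10 and one resolve when it drops back.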

That's pretty much all there is to it!  You can find the docs on Alert Birds here, please do drop me a line at support@loggly.com if you need a hand, and until next time, happy alerting!


Announcing the Loggly Addon for Heroku

Posted 31 Aug, 2011 by Kord Campbell in Business and Log Management

I'm pleased as punch to announce the Loggly Add-on for Heroku is now in private beta!  The fine folks over at Heroku just emailed me the good news a few hours ago:

"Heroku is very excited to announce the availability of the Loggly add-on to our thousands of developers. Loggly is intuitive, easy-to-use and makes logging fun again by providing a rich set of features enabling users to search and analyze their logs."

It's been over a year since we first visited Heroku's offices to discuss providing a logging add-on for their platform users.  The result of both companies' efforts over the last year is the first third-party Heroku logging add-on which leverages the power of our highly scalable logging search engine and the sophistication of Heroku's new Logplex infrastructure.  

Simply put, it's awesomesauce in the cloud.

 
 

Solving a Big Problem

The challenge Heroku faced with customer logs centered around getting access to all of the logs out of a dyno's stack.  Heroku's stack can generate log events from the load balancer, cache, and database server, as well as logs from other add-ons, and more.  While Loggly customers have been able to send logs from the app layer for a while now using Ed Muller's super duper Logglier library, getting the remainder of the events from the stack required a rework of the way those events were routed around on the Heroku platform.
 
Earlier in the year Heroku released the first set of features based on this work, dubbing the project Logplex and open sourcing it over on Github.  This solution allowed Heroku and Loggly users to use the Heroku authored Logging Add-on to forward logs to a syslog port over on Loggly.  This worked well for getting access to events out of the remainder of the Heroku stack, and set the stage for Loggly to build a proper add-on for users to add to their Heroku account.
 
Without all the hard work from the fine folks at Heroku, we'd never have been able to pull off writing our Add-on.  You guys rock!
 

Scaling is Hard

Scaling a sophisticated platform as a service offering like Heroku is a massive challenge.  There are brilliant people over at Heroku who have spent insane amounts of time working on scaling their platform to 100s of thousands of applications, all the while adding non-trivial features like Logplex to their infrastructure.  It's a bit like changing tires on a fighter jet flying at mach 2.
 
Loggly has been spending time changing tires on jets too.  When we launched in February of this year we supported a paltry 2GB a day volumes on accounts.  In April we raised that to 4GB a day per account and in June we doubled that to 8GB a day, as well as releasing new pricing plans supporting custom volume and retention times.  Today Loggly has over 2,500 customers, and we just upped our volumes to 12GB a day per account in anticipation of our Heroku Add-on launch.  We'll continue to up our volumes over the next few months, and continue to add features which provide custom logging solutions for web application developers.
 
Be sure to keep an eye out for more Loggly features like realtime feeds and alerting real soon.
 
Keep on logging!

 


Siloam Springs' Logging Story

Posted 22 Jun, 2011 by Marie Schultz in Business and Log Management

Chris from Siloam Springs from Hoover Beaver on Vimeo.

Christopher Hobbs, senior system administrator for The City of Siloam Springs, Arkansas, talks to Kord about how he uses Loggly to debug and troubleshoot the dizzying array of systems he maintains for the city and police and fire departments. You can follow Chris and Siloam Springs on Twitter, and browse his public repos on Github.


App47's Logging Story

Posted 22 Jun, 2011 by Marie Schultz in Business and Log Management

Chris from App47 from Hoover Beaver on Vimeo.

Chris Schroeder, CEO of App47, talks to Kord about how App47 is embedding Loggly into their offerings to provide analytics and troubleshooting for mobile developers. The focus is on the enormous challenge of understanding user behavior, improving the experience and troubleshooting application crashes. You can follow App47 and Chris on Twitter.


Send Custom Metrics to CloudWatch's API

Posted 10 May, 2011 by Kord Campbell in Business and Code

A few weeks ago I wrote up how to implement simple alerting with Loggly and PagerDuty. This week I’m covering how to do something very similar with the new version of Amazon’s CloudWatch which they recently released.

Amazon doesn’t rely on a monitoring agent to collect the metrics for CW, so it’s literally a few clicks in the AWS interface to start using it. Data is collected by their pre-instrumented hypervisor and then forwarded to the CW service where it can be selected, displayed and alerted on by the user.

With the latest release of CW, Amazon provides new endpoints in the CW API which allow a user to send in custom metrics. These metrics can be used in combination with the hypervisor based metrics to build complex alerts and drive auto-scalability for applications based on EC2.

It’s that new functionality that I’ll be using to send data from Loggly to CloudWatch.

The Code

As always, the code for this post is parked on Loggly’s Github account. The cloudwatch.py file contains the signing bits required for talking to Amazon’s API endpoints, and some basic code for posting to the PutMetricData method. You don’t need the boto library for this, but it won’t hurt if you already have it installed.

The detailed instructions for setting it all up are on the Github project page. Basically all you need to do to get this running is to get syslog-ng forwarding your web logs to Loggly, configure your Loggly credentials, and then enter your AWS_ACCESS_KEY_ID and AWS_PRIVATE_ACCESS_KEY_ID in the code.

You’ll need a few cheese shop libraries installed, including httplib2, simplejson and hoover, the Loggly Python library.

Set up a cronjob file that runs it periodically, preferably on an instance you are monitoring.

*/5 * * * * python ~/loggly-watch/main.py

The Result

The code above conducts a simple search on Loggly for all events being sent to the default input for your account. If all you are sending to that input is combined_access formatted log lines, then you’ll end up with hit counts sampled every 5 minutes from Loggly, offset by one minute to ensure we’ve indexed them properly.
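
The sampling window described above (five minutes wide, ending one minute in the past) works out as follows. This is a small sketch of the arithmetic only; the function name and defaults are illustrative and not taken from main.py.

```python
from datetime import datetime, timedelta


def search_window(now, width_minutes=5, offset_minutes=1):
    """Return (start, end) for a window that ends offset_minutes before now."""
    end = now - timedelta(minutes=offset_minutes)
    start = end - timedelta(minutes=width_minutes)
    return start, end


# A cron run at 12:00 would count events from 11:54 up to 11:59.
start, end = search_window(datetime(2011, 5, 10, 12, 0))
```

Successive runs five minutes apart produce back-to-back windows, so each event is counted once while still giving the indexer a minute to catch up.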

The result is pretty impressive, with so little work involved. You can even do combo graphs containing metrics delivered by the AWS hypervisor.

Alarms

Once the metrics are flowing in, you can set alarms to trigger if they go over (or under) a certain threshold. In the screenshot below I’m monitoring for the term ‘exception’ coming in from my crappy blog which is hosted on AppEngine and which logs with my AppEngine async logging library.

The screenshot above shows where CW triggered an alarm for exceptions, then cleared itself after the threshold dropped below 4.

Monitor the Monitor

With Loggly and CloudWatch alerting, there are a whole host of monitoring and correlation use cases you can tackle with just a little bit of hacking. You can even alarm on the cronjob itself to ensure your monitoring is functioning and healthy. Here’s how.

Start by making sure your local syslog instance is sending data to Loggly, and then change your cronjob to pipe its output to logger:

*/5 * * * * python /home/kord/code/loggly-watch/main.py 2>&1 | logger -t cloudwatch-cron

Next, set up a search in the same main.py file you are calling with cron to search for a successful run of the cronjob that runs the search (that’s so meta it hurts):

Note: I’m keeping this example purposefully simple. In practice you’ll probably want to make this check a little more sophisticated by checking that the response from the Loggly server is valid, and that each search ran successfully.

Finally, create an alarm that triggers if the results number fewer than 1 over a 10 minute period.
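
The alarm condition amounts to a heartbeat check: with the cronjob running every 5 minutes, a 10 minute period should hold two samples, and at least one of them should have found a successful run. A minimal sketch of that rule, using a made-up function name and `None` to mark a search that failed outright:

```python
def monitor_healthy(samples, minimum=1):
    """Decide whether the monitoring cronjob looks alive.

    samples holds the hit count from each run in the period; None marks
    a run whose search failed. Healthy means at least one valid sample
    and a combined hit count of at least `minimum`.
    """
    valid = [s for s in samples if s is not None]
    return bool(valid) and sum(valid) >= minimum
```

In CloudWatch itself this rule is expressed through the alarm's threshold and period settings rather than in code; the function only spells out the condition.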

Happy alerting!


Bebo's Logging Story

Posted 9 May, 2011 by Kord Campbell in Business, Log Management, and Startup

Aren Sandersen, VP Operations for Bebo, came over today and had lunch with us. Afterwards, we sat down and chatted about how Bebo is changing their infrastructure, how they manage logs, and how they use Loggly to do debugging, alerting and operational troubleshooting. You can view the video on your iPhone via broadband, or watch the mobile version as well.

You can follow Aren and Bebo on Twitter, or sign up for Bebo on their website.


Everyone's Talking About AWS Being Down

Posted 22 Apr, 2011 by Kord Campbell in Business and Startup

Everyone seems to be blogging about how their service has been impacted by Amazon’s AWS outage, or whining about how Amazon sucks, or explaining why their service was architected so well that it didn’t impact them, or why you suck if you didn’t plan for this.

As Loggly is based entirely on AWS and was only minimally impacted during the first few hours of the start of the outage, I figured I’d share exactly how we managed to do what we did:

We got fucking lucky.

That’s How We Roll

Loggly is based ENTIRELY on us-east. We run across multiple availability zones and don’t rely on EBS for anything other than backups of a few simple databases. Everything else is file based on the EC2 instances, and their drives are set up in a RAID-1 configuration for speed and slightly more reliability. Our log streams are backed up to S3 every few minutes as they come into our proxies. We rely on RDS for the database for the user logins, which really ended up being the only thing affected.

I asked Jordan Sissel, Head of Ops and Senior Developer here at Loggly, to describe exactly what happened when Nagios/Pagerduty went off the night before last. Here’s what he said:

I saw RDS (prod db) problems in the early morning just as the problems started, but by the time I started debugging it the problem went away. I was notified by pagerduty because beaveroil and some other checks were failing.

Otherwise we weren’t really impacted. We got lucky, I think. I kept my eye on service but it stayed happy during the AWS outages.

Worst case, it’s easy for us (assuming rightscale is functioning, which it wasn’t for some of the day) to migrate to different parts of EC2 due to our use of puppet and our lack of EBS usage (only our RDS uses EBS)

Planning for Failure

Jordan is right, we can pretty much do a Loggly deployment on any AWS region within 20-30 minutes. Because we use Zerigo for DNS, and because we keep short TTLs, we can switch out records and have them updated quick and redirect ALL our inbound and outbound traffic to the new deployment.

Of course that leaves the question about migrating data on our existing or failed indexers. Thankfully that’s not a huge issue for us because we can rebuild them using EMR from our S3 backups.

Before we launched Loggly’s public service, I mandated that Loggly must be able to rebuild failed indexers at will. The work required to support this ended up delaying our public launch by at least 45 days. Now if we lose a box or have to move to another region, we can rebuild any (or all) of our indexers in at most a few hours. In theory we could continue to index new data coming into the system and historical search impact to customers would be minimal.

As my friend Clay Loveless put it so elegantly, “We may rip the rug out from under your feet at any moment.” If you haven’t planned for disaster striking, then you should go back and reassess your infrastructure. Hopefully we’ve planned well for just such a disaster.

<knocks on wood>


Chris Wensel's Logging Story

Posted 20 Apr, 2011 by Kord Campbell in Business and Log Management

Last week Chris Wensel swung by the Loggly office, had lunch, and sat down to do a short video with me to talk about Cascading, what his company Concurrent has been up to, and to discuss how Hadoop is good for processing a crap load of logs. Check out the video below. You can view the video on your iPhone via broadband, or watch the mobile version as well.

You can follow Chris, Cascading and Concurrent on Twitter, download Cascading, and check out his company Concurrent, Inc. Be sure to keep an eye out for the new version of Cascading which is due out soon!


Logging Challenges and Logging in the Cloud - PodCast

Posted 16 Dec, 2010 by raffy@loggly.com in Business and Log Management

I was invited as a guest to the CloudChaser podcast with Matt Grant.

We talked about a number of interesting topics related to logging, cloud, and security.

Log Management Challenges

We discussed a number of log management challenges from log generation to security in the cloud. Here is a brief list of topics we talked about:

  • We first touched upon some issues with log file generation. I discussed the lack of logging guidelines and the problems that brings with it.

  • How are logs analyzed? One of the problems is that it should really be the application owners that look at their logs. From a security point of view, security analysts should look at the overall picture. But they should not be the only ones looking at those logs. It’s impossible for them to understand all the logs on an intimate level.

  • Yet another problem is understanding the logs. Visualization is an interesting way of addressing that issue. Especially for reporting and exploration or discovery.

  • Large-scale log storage seems to be a problem. Is it? Make sure you set up use-case driven retention policies!

We touched upon a number of other topics. Here is a short list:

  • It seems that users are moving more and more into the application layer to collect logs. It’s not just the infrastructure layer anymore!

  • Availability, performance, etc. can be a great way of selling your log management budget instead of using security as a selling point.

  • Obviously we talked about Logging as a Service and Loggly in particular. A lot of logs are in the cloud or are being moved into the cloud ;)

  • Security and regulatory concerns for logging in the cloud are always a fun topic. We discuss this briefly. The upshot is that it often isn’t a show stopper!

But hey, listen for yourself!

Server Density

Posted 23 Nov, 2010 by Kord Campbell in Business, Log Management, and Startup

My old friend David Myton from Boxed Ice swung by the Loggly office the other day to say howdy. I sat down with him and did a quick Geek CEO video about bootstrapping, developing product, filling the sales pipe, and listening to him being wise-beyond-his-years about raising capital.

Server Density is now growing at 20% a month, enjoys super low churn, and just got to break-even over the weekend. While others are just dreaming about startups, or grinding away for the man, David is here living the dream.

David Myton, CEO of Boxed Ice from Hoover on Vimeo.

 

David shared with me that the company will be moving to the Bay Area in a few more months, once they grow a little more and get their work stuff sorted out. They’re more than welcome to squat with us when they do!

Big Data Gets Bigger

Posted 13 Oct, 2010 by Kord Campbell in Business, Code, and Log Management

Edited on October 14th, for 2 orders of magnitude bad math.

Big data is big news. Big data is a big problem, and big solutions for it can drive big revenues. Because big money is involved, more and more people are writing and focusing on how big of pack-rats we’ve become. There’s only one fact everyone seems to be missing: Big is relative, after all.

Big Data in the Past

Back in the 70s when I was a kid, my family’s oil business had one of those old clunky Burroughs machines, which my mom not-so-fondly called Maribel. Whenever you wanted to invoice someone, you would load Maribel up with the customer’s account history from paper tape and then manually enter the new invoices. When the existing tape got full, you started a new one. The tapes were yellow, about an inch across and maybe 20 feet long.

We stored these tapes in envelopes, and the envelopes were in turn stored in vertical file cabinets. The hall outside my mom’s office was lined with these file cabinets, and the cabinets were literally overflowing into the kitchen because there was no more room in the hall for them. If you estimated 5 bits per line, 72 lines per foot, and 20 feet of tape, that would give you roughly 1KB of storage on a single tape. Multiply that by thousands of these tapes and I figure we had a total of 1-2MB of data stored in about 100-200 sq ft of space.
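The tape estimate above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the per-tape figures come from the paragraph above, while the tape count of 1,500 is an assumed stand-in for "thousands".

```python
# Figures taken from the estimate in the text.
BITS_PER_LINE = 5
LINES_PER_FOOT = 72
FEET_PER_TAPE = 20

bits_per_tape = BITS_PER_LINE * LINES_PER_FOOT * FEET_PER_TAPE
bytes_per_tape = bits_per_tape / 8


def total_megabytes(tape_count):
    """Total storage across tape_count tapes, in decimal megabytes."""
    return tape_count * bytes_per_tape / 1_000_000


print(bits_per_tape)          # 7200 bits per tape
print(bytes_per_tape)         # 900.0 bytes, i.e. roughly 1KB
print(total_megabytes(1500))  # 1.35 (the tape count of 1500 is an assumption)
```

Anywhere between about 1,100 and 2,200 tapes lands in the 1-2MB range.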

Lots of customers, lots of tape, lots of work, and lots and lots of data. At least lots for 1976.

Your Future Arrived Yesterday

In 1996 my future had arrived. I was running a moderate sized ISP, and found myself buying a full-height 5 1/4" 8GB drive from Seagate for my news server. It cost me just over $2,000. With that one drive alone, I could have stored nearly 300 football fields’ worth of Maribel’s yellow tape based data.

Just last weekend at Lucene Revolution I gave some company my email address in exchange for an 8GB USB drive. I promptly tore it apart and extracted from its guts a sliver of a micro SD card. I could easily fit a few thousand of those cards in the space of that old clunky Seagate drive.

Earlier this year an article in Wired quoted IDC as saying the size of the information universe in 2009 was 800 Exabytes. IDC went on to say 2020’s information universe was expected to be a staggering 35 Zettabytes, nearly 44 times as much data as there is in existence today.

For reference, one Zettabyte = one thousand Exabytes, one Exabyte = one thousand Petabytes, one Petabyte = one thousand Terabytes, and one Terabyte = one thousand Gigabytes. That means a Zettabyte = a million million Gigabytes!

That’s around 3 × 10^16 times as much data as we had in our office in 1976! If we decided to store it in file cabinets filled with yellow tape, our dystopian future’s 35ZB of data would take up the surface area of 546 earths. Say what?
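For anyone who wants to replay the unit math, here is a small sketch using the decimal (powers of 1000) units given above. The 1MB and 2MB values are the two ends of the 1976 office estimate. The variable names are just illustrative.

```python
# Decimal storage units, as defined in the text.
MB = 10**6
GB = 10**9
TB = 1000 * GB
PB = 1000 * TB
EB = 1000 * PB
ZB = 1000 * EB

# One Zettabyte expressed in Gigabytes: a million million.
gb_per_zb = ZB // GB

# Growth from 800 Exabytes (2009) to 35 Zettabytes (2020).
growth = 35 * ZB / (800 * EB)

# 35ZB compared with the 1-2MB of tape data in the office in 1976.
ratio_small_archive = 35 * ZB / (1 * MB)
ratio_large_archive = 35 * ZB / (2 * MB)

print(gb_per_zb)   # 1000000000000
print(growth)      # 43.75, i.e. "nearly 44 times"
print(ratio_small_archive, ratio_large_archive)  # roughly 3.5e16 and 1.75e16
```

The "around 3 × 10^16" figure sits at the upper end of that range, which corresponds to the 1MB estimate.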

It reminds me of something you’d see in a Douglas Adams novel, where thousands of small, slightly cranky robots named Maribel are forced to shovel and store yellow tape rolls until they collapse into a pile of rust several million years later.

Smell the Data Exhaust

Data exhaust can be defined as the machine events generated when a user accesses data stored on a system connected to the Internet, such as when a user accesses their photos on Flickr. Hadoop Karma indicates Flickr was storing 4 billion photos by the end of 2009. In aggregate, those photos are stored on thousands of servers and are being viewed by millions of users across the globe every day.

In a simple scenario where all the photos on Flickr were viewed once each by a single user, the logs would weigh in at just over 2TB! In reality, Flickr’s log volume probably exceeds a Petabyte a year for just the views of the lightbox pages alone. Facebook’s numbers are even scarier. In one month they’ll store 2.5 billion photos on their system. In turn, all the people viewing those photos will generate an order of magnitude more log data than Flickr even has in all the photos they’ve ever stored.
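The 2TB figure is easy to reproduce if you assume one log event per photo view. The sketch below assumes roughly 512 bytes per event; that size is not stated anywhere above and is picked only because it makes the arithmetic land near the quoted number.

```python
PHOTOS = 4_000_000_000     # Flickr photo count quoted in the text
BYTES_PER_LOG_EVENT = 512  # assumption: the post does not give a per-event size
TB = 10**12                # decimal terabyte


def log_volume_tb(views, bytes_per_event=BYTES_PER_LOG_EVENT):
    """Log volume in TB for a given number of views, one log event per view."""
    return views * bytes_per_event / TB


print(log_volume_tb(PHOTOS))  # 2.048, i.e. "just over 2TB"
```

Changing the assumed event size scales the result linearly, so a 1KB event would double it.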

Even though we’re in private beta at the moment, we’re already seeing combined log volumes of around 3GB a day from 15 customers. A few of our customers, including About.me and Server Density are sending us near the max of what we allow on the private beta right now. We expect those volumes to go up considerably when we launch the public beta in December, where an average customer could be sending us anywhere from 1 to 5GB a day each. It won’t take long to start referring to our data in units of Petabytes stored.

While demand for storing all those logs is accelerating along with all the data being generated, the technology behind the storage and processing of data also continues to accelerate. Within a few months’ time, the technology we are developing at Loggly will provide companies a way to peek into these large volumes of log data – where they couldn’t before – and allow them to see exactly what their users are doing with all that big data.

Loggly’s features for search, reporting and map reducing will make dealing with these huge volumes as trivial as stuffing a yellow punch tape into an envelope, except we don’t need a robot named Maribel to do it.

And so the Universe ended.

Logging as a Service (LaaS) - A podcast

Posted 8 Oct, 2010 by raffy@loggly.com in Business and Log Management

The other day, Andreas from Nemertes posted a blog entry titled The missing piece of cloud security?. In his post, Andreas talks about how there is no real solution for handling logs in the cloud. Since Loggly has been in private beta, I can’t really say that Andreas is wrong.

Instead of (or in addition to) reading the rest of this post, listen to the podcast that we recorded last Wednesday as a follow-up to Andreas’ blog entry.

Loggly is the first logging as a service (LaaS) platform.

In his blog post, Andreas mentions a number of challenges associated with customers doing their own log management in the cloud (I took the liberty of expanding the list a bit):

  • Ephemeral virtual machines ask for log centralization.
  • Centralization of logs creates a single point of failure.
  • Installation and maintenance of a logging server takes time and costs resources (and money).
  • Knowledgeable (and expensive) personnel are required to configure logging solutions and maintain them.
  • Static solutions (installing your own log management tool) do not fit into the cloud model of “pay as you go”.
  • Building a scalable and reliable logging solution in the cloud is hard and expensive.

These reasons and a number of others are the foundation of Loggly. We are eliminating these problems for our customers.

Let me continue along Andreas’ blog post. He moves on to talk about the benefits and use-cases for a logging as a service platform:

  • security information management
  • regulatory compliance, incident response and post-incident forensics
  • control, visibility and resilience, while preserving “chain of custody” for audit purposes

In my view, these are fantastic use-cases for a logging as a service platform, but there are so many other uses, especially in the world of Web applications. There is a big, big ecosystem around application logging that benefits greatly from a logging as a service platform. And to support that ecosystem, we will keep innovating and adding new features to our platform. One step in that direction is our new HTTP inputs, which allow you to send HTTP POSTs containing log messages.

Want to know more? Sign up for the Loggly Beta.

Loggly Extended Beta

Posted 5 Oct, 2010 by Kord Campbell in Business and Startup

The extended private beta version of Loggly was pushed to production last Friday, October 1st, 2010. This release was planned in early August for release by end-of-September, so we did pretty well on getting it out when we did.

We called this release ‘the extended private beta’ because, well, we are extending the private beta to include more people who signed up on the beta registration page. We were so busy writing code for the release we couldn’t think of anything more exciting than this. Shoot, we were so busy I actually forgot to blog about the release!

If you haven’t gotten your invite yet, please be patient. We’re launching servers as fast as we can! Here’s a screen shot of the thing to tide you over:

Although we’re still in private beta, our roadmap for getting to public beta is only a few months out. We’re planning on rolling out paid services for the private beta users toward the middle of November, and we should open up access to all registrations by the middle of December.

Min Product – Max Volume

In these days of rapid minimum viable product launches, we’ve been comparatively slow in launching Loggly’s service. Unlike other MVP offerings, Loggly will be expected to handle customers that send in anywhere from a few MB to multiple GB of log data per day. Before we launch we have to test that the systems will scale and won’t melt under load. The extended private beta is part of that testing.

While we think we’ve got the scale issue licked for the moment, we’ve decided to add a feature which rate limits early beta accounts to 200MB/day of data. So we don’t block the upstream syslog servers, we’ll continue to accept and count event data, but will discard any data which exceeds the account limits. Limits are reset at midnight GMT, but if you need a higher daily limit please let us know.
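The accept, count, and discard behaviour described above can be sketched roughly as follows. This is an illustration only, not Loggly's actual code: the `DailyQuota` class and its field names are hypothetical, and treating 200MB as 200 × 10^6 bytes is an assumption.

```python
from datetime import datetime, timezone

# 200MB/day from the text; decimal megabytes are an assumption.
DAILY_LIMIT_BYTES = 200 * 1000 * 1000


class DailyQuota:
    """Counts every event, but only keeps events under the daily limit."""

    def __init__(self, limit_bytes=DAILY_LIMIT_BYTES):
        self.limit_bytes = limit_bytes
        self.day = None
        self.bytes_seen = 0
        self.events_seen = 0
        self.events_discarded = 0

    def accept(self, event: bytes, now: datetime) -> bool:
        """Return True to store the event, False to discard it.

        `now` must be a timezone-aware datetime. The event is always
        read and counted, so the upstream sender is never blocked.
        """
        today = now.astimezone(timezone.utc).date()
        if today != self.day:
            # Counters reset at midnight GMT.
            self.day = today
            self.bytes_seen = 0
            self.events_seen = 0
            self.events_discarded = 0
        self.events_seen += 1
        self.bytes_seen += len(event)
        if self.bytes_seen > self.limit_bytes:
            self.events_discarded += 1
            return False
        return True
```

The caller would keep one `DailyQuota` per account and drop any event for which `accept` returns False, while still acknowledging it to the sender.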

Over the next few months we will raise these limits for the private beta users, and when we launch public beta sometime in December we’ll have a freemium account which will do somewhere in the neighborhood of 250-500MB/day. Paid tiers will be created to handle volumes above these rates. Paid tiers will also have additional features available to them such as S3 storage access, Hadoop processing, etc.

This release was the first of many steps toward making Loggly kick some serious ass when it comes to storing and searching the log files coming out of your infrastructure.

If you are interested in helping us beta test the product, please fill out the signup form and then give us a shout out on Twitter. We’ll see what we can do to get you on ASAP!

Our Solr system

Posted 9 Aug, 2010 by jon@loggly.com in Business and Log Management

Jon Gifford

I was one of three speakers at the Lucene/Solr meetup last month, co-sponsored by salesforce and Lucid Imagination. I don’t know how anyone at salesforce with a window gets any work done, considering the view – take a look at Grant’s photo to see what I mean. Thanks to Bill from salesforce for hosting, and the guys at Lucid for organizing things. You can check out the two other talks here, as well as talks from previous meetups.

UPDATE: I’ll be doing a slightly expanded version of this talk at Lucene Revolution in Boston on October 8th, incorporating some of the stuff I talk about below.

I got a few interesting questions and comments after the talk, so I thought I’d expand a bit on what was in my slides, which were perhaps a little dense.

“Log Search is highly skewed”

In the talk, I said that the most important search data is the most recent. When you have a problem, you’re far more likely to care about what happened in the last few minutes or hours or days than what happened a month ago. That’s not to say that you’ll never need to search older data, just that most of the time, you won’t.

After the talk, though, it became obvious that I should also have said that our users are likely to use search in a way that is also pretty skewed when compared to “normal” search products. Basically, we expect that most people will use the system somewhat sporadically, but that when they do, it’s likely to be a pretty intensive session of bug hunting. So instead of a fairly continuous search load, we get random spikes for a small subset of all the data we have in Solr. This is actually good for us, because we don’t need to keep all of the shards for all of our customers “hot” in Solr. When a customer shows up, we can warm their data quickly, and let Solr and the filesystem cache do their thing to deal with shards that haven’t been used for a while.

The most important point here is that the overall system is going to be spending the vast majority of its resources on indexing, rather than searching. I can’t give you numbers, but if we end up spending anything more than about 5-10% of our cycles on search, I’ll be very surprised. This is not your typical consumer search product.

0MQ

I talked a bit about 0MQ, and said that we chose it primarily because it’s fast and lightweight, even though it’s possible that we could lose data if things break. I clarified this a bit in a comment on Sarah Allen’s blog because I want to make sure the message is that 0MQ is awesome, not that it loses data. Here’s the guts of what I said…

I wanted to clarify one point in your writeup, though, to make sure people don’t get the wrong idea about 0MQ. Yes, our implementation of 0MQ has a potential “leak”, where we can lose messages, but it’s a very uncommon case, and the impact is small. Specifically, if one of the solr nodes dies hard, we potentially lose any events that were sent to it in the last batch (0MQ batches to minimize comms overhead). In steady state, 0MQ is rock solid, 100% reliable, and faaaaaast.

 

Pieter (at iMatix) and I are currently discussing ways to solve the hard death problem, and I don’t anticipate it being a problem very long. As I said in the talk, 0MQ is unbelievably cool – if you haven’t got a project that needs it, make one up!

 

We sponsored some work to get the SWAP functionality in version 2 of 0MQ, and I’ve been blown away by the guys at iMatix – they really want 0MQ to work, and work well. My  throw-away comment prompted an email from Pieter asking for more details, and, as I said to Sarah above, we’re already looking at how to fix it.

 

Oh, and in case you’re wondering how fast a one-armed paper-hanger is, take a look at what The Word Detective says about it (scroll down till you see the “You missed a spot” section). Maybe I should have used “flat out like a lizard drinking” instead?

Sharding

The way we create shards by indexing, then merging, then merging again and again and again raised a few questions that are worth repeating…

To recap, we build small (5 minute) shards on our hot indexers. When we stop adding events to them, they get merged with older shards until we hit another size limit (30 minutes). They then get merged with even older shards, until we hit the next time limit (4 hours). And so on up the chain until they cap out at a week long. Along the way, we push indexes from box to box, to balance the load on the system as a whole.
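One way to picture a single merge step is the Python sketch below. It is not our actual implementation: the `merge_tier` helper is hypothetical, shards are reduced to `(start, end)` pairs in minutes, and the rule that shards merge only once a whole window of the next size is covered is an assumption made for the example. The tiers between 4 hours and a week are not listed above, so they are left out.

```python
# Tier sizes in minutes: 5 minutes, 30 minutes, 4 hours, and the 1 week cap.
# Any tiers between 4 hours and a week are not stated, so they are omitted.
TIER_MINUTES = [5, 30, 240, 10080]


def merge_tier(shards, target):
    """Merge closed shards into target-minute shards.

    shards is a list of (start, end) pairs in minutes. A window of
    target minutes is merged only when existing shards cover it
    completely (an assumption of this sketch). Returns (merged, pending).
    """
    buckets = {}
    for start, end in shards:
        buckets.setdefault(start // target, []).append((start, end))

    merged, pending = [], []
    for window, parts in sorted(buckets.items()):
        covered = sum(end - start for start, end in parts)
        if covered == target:
            merged.append((window * target, (window + 1) * target))
        else:
            pending.extend(sorted(parts))
    return merged, pending


# Seven 5 minute shards: the first six fill a 30 minute window.
five_minute_shards = [(m, m + 5) for m in range(0, 35, 5)]
merged, pending = merge_tier(five_minute_shards, 30)
print(merged)   # [(0, 30)]
print(pending)  # [(30, 35)]
```

Running the same function again with the next tier size (240, then 10080) moves shards further up the chain.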

The first question is fairly obvious: Why?

At first glance, it seems like we’re just creating work for ourselves. Surely we could just build the shards and use them as is, right? The problem is that we would have a lot of 5 minute shards floating around the system, and we already know that Solr starts getting cranky when you run a lot of cores in a single instance. So, why don’t we just build bigger shards? The issue there is that with the version of Solr we’re using, we have to reopen the index to make new data available, and we currently do that every 10 seconds (hence the “NRT + SolrCloud = Our Nirvana” in my slides). Since we have to do this, we’d end up with too many segments in the hot index, or (if we’re not careful with our merge factor) a lot of automatic merging that means that the hot index becomes unavailable for updates for too long for my liking. So, we got pushed into this approach by something that I’m hoping will soon be a thing of the past. I’m really looking forward to Michael Busch’s talk at Lucene Revolution which promises to remove the “N” from NRT. I’m not sure what is better than nirvana, but I’m hoping to find out soon.

We may have been forced into doing things this way, but there is a lot of value in the model we have. In some ways, we’re taking over a part of Lucene (merging) that has been absolutely  invaluable, but can sometimes be a little difficult to control. We now have complete control over when and where indexes get merged. I probably should point out that we deliberately don’t do any merging on the 5 minute shards, and that we’re careful with the merge parameters on the larger shards to make the merges that do happen as efficient as possible.  The model also gives us a very simple index naming scheme based on time, which means we always know exactly where to find data for a time-constrained query. More on this in a bit…

The next question (from the meetup) was what is the overhead of all this merging?

Rather than give numbers, it’s worth thinking about whether we’re actually doing anything more than Lucene already does when you start building big indexes. I think the answer to that is that we’re actually just exposing and taking over the automatic behaviour, rather than doing something “extra”. So I think the real overhead is close to zero. Compared to building a bunch of shards in parallel using Hadoop, we’re certainly doing more work, but most of the Hadoop based systems I’ve looked at are geared more towards building indexes from a large existing corpus, rather than dealing with a real time stream.

My final comment on this is that since it’s all completely configurable, we’re not locked into any of the times I’ve mentioned above. Maybe when we move to NRT, or RT, we can bump the hot shard size up to hours or days, assuming that we’re still in control of merging. We shall see…

Constructive Laziness

Circling back to the first section, where I talked about how skewed we expect our search to be, the time-based shards give us a very clean way to limit the impact of our search requests. Since we can constrain a search to a specific time period, it’s easy for us to identify which indexes we need to hit to satisfy the search. Our ideal search is for something in the last few minutes, which can be entirely served out of one or two of the five minute shards. We may have gigabytes or (hopefully) terabytes of index data for the same customer sitting around on our system, but if we can satisfy their request by hitting two small, heavily cached cores, then we’re in great shape. I wonder if life will be so kind to us?
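Picking the shards for a time-constrained query comes down to an interval overlap test. A minimal sketch, assuming each shard is described by a `(start, end)` pair in minutes; the `shards_for_query` helper and the sample data are made up for illustration and are not our real index naming scheme.

```python
def shards_for_query(shards, query_start, query_end):
    """Return the shards whose [start, end) range overlaps the query range."""
    return [
        (start, end)
        for start, end in shards
        if start < query_end and end > query_start
    ]


# One 4 hour shard, one 30 minute shard, and two hot 5 minute shards.
customer_shards = [(0, 240), (240, 270), (270, 275), (275, 280)]

# A search over the last 8 minutes only touches the two small shards.
print(shards_for_query(customer_shards, 272, 280))  # [(270, 275), (275, 280)]
```

A query reaching further back simply picks up the larger, colder shards as well.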

Random aside: Synchronicity

Every now and then, things just come together in strange ways. A couple of weeks ago, Kord and I talked with Diego and Santiago from Flaptor, who are working on IndexTank. Diego and I were at LookSmart together years and years and years ago, but that’s not the synchronicity. As we were talking, Diego said they were working on a “Nebulizer” which does automatic distribution of their index in the cloud. The day before the meeting, I’d pulled all of the code that deals with this in our system into a class named “TheDecider” (I’m still wrestling with a way to make misunderestimate() a useful method in this class). That evening I went to a NoSQL meetup, and met someone who is also working on the equivalent for their system. Maybe there is something in the air?

 

Cross-Domain AJAX Calls To Query Loggly's APIs

Posted 27 May, 2010 by raffy@loggly.com in Business and Log Management

Last week I started playing some more with the Logging APIs from Loggly. For the first time I started embedding AJAX calls to the API into a Web application running on an external domain. Well, guess what happened? The browser barked at me telling me that I couldn’t execute a cross-domain AJAX call. I guess from a security perspective, that makes a lot of sense. However, I started thinking about how I could overcome this problem. The one way that I could have done it was not to use AJAX, but write some code server-side that would fetch the information from the Loggly API and then present it back to my Web application. I could even expose the information as an end point on the same domain that I then query from my application (see Figure).

Well, this seemed wrong. Why did we design a really nice, RESTful API if developers who want to use it have to build a server-side wrapper first? This didn’t make sense to me. So I kept digging. Fortunately, I found the solution. It’s called JSONP (JSON with Padding). Here is how it works and how you can leverage it in your own applications.

Let’s assume I am building an application at labs.loggly.com that will access the API located at loggly.loggly.com. With jQuery, my AJAX call looks as follows:

Now, if you do this, you will get the cross-domain error. However, if you just slightly change your call to include an extra parameter, it will succeed:

Note the newly added dataType parameter. That’s it? Yes, that’s it. It will work like a charm. No more cross-domain security issues. Basically, two things happen. First, the AJAX request that is executed has one extra query parameter: &callback=?, where the question mark is some string that jQuery randomly generates. Second, on the Loggly side, if the callback parameter is present, Loggly does not return the plain JSON element that you would expect, but wraps it in a function call. Something like:


The next thing that happens is that when your browser gets the answer back like this, it will try to execute the function called jsonp12312312. jQuery handles that for you internally by creating a hook for that function that points to the success function provided to the AJAX call.
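To make the server side of this exchange concrete, here is a minimal Python sketch of the wrapping step. It is an illustration under stated assumptions, not Loggly's or Django Piston's actual code: `render_response` is a hypothetical helper, and the check on the callback name is an extra precaution I added rather than something described above.

```python
import json
import re

# Only allow callback names that look like plain JavaScript identifiers
# (an added safety check, not part of the behaviour described in the text).
CALLBACK_NAME = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*")


def render_response(payload, callback=None):
    """Return (body, content_type) for an API response.

    Without a callback the payload is returned as plain JSON. With a
    callback the JSON is wrapped in a call to that function (JSONP).
    """
    body = json.dumps(payload)
    if callback is None:
        return body, "application/json"
    if not CALLBACK_NAME.fullmatch(callback):
        raise ValueError("invalid callback name")
    return f"{callback}({body})", "application/javascript"


print(render_response({"count": 3}))
# ('{"count": 3}', 'application/json')
print(render_response({"count": 3}, callback="jsonp12312312"))
# ('jsonp12312312({"count": 3})', 'application/javascript')
```

Because the second form is a script rather than data, the browser can load it from another domain and run it, which is what ends up invoking the success function.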

 

That’s really it. We are looking forward to seeing your applications that are using the Loggly APIs!

By the way, Loggly is using Django Piston for handling the APIs. The library automatically handles JSONP responses when a parameter called “callback” is present!

We Got Funded Again

Posted 18 May, 2010 by Kord Campbell in Business and Startup

On Monday, Loggly closed a $4.2M B round, with Trinity Ventures leading and True Ventures participating. As you may recall from my previous post, True led our initial seed investment, which was closed 5 months ago to the day.

My relationship with Trinity goes back over a year and a half – well before Raffy and I started thinking about doing a cloud based log management offering. Like many other startups, Trinity was started by two entrepreneurs. For years, Trinity has focused on early stage companies in specific technology categories, such as cloud computing and systems management.

Loggly is extremely fortunate to be working with Trinity and their bright team, and we greatly value the market experience they bring to the relationship.

History Lessons

I met Trinity through a fairly short introduction path. My good friend and former colleague, Dakota Sullivan, introduced me to a gentleman named Matt Strand in January of 2009. Matt and I had coffee at Crossroads Cafe in South Beach where I told him I was looking to join or start a cloud computing based company. Matt figured he should hook me up with a VC buddy of his, Dan Scholnick at Trinity.

Here’s the email introducing the two of us:

Dan <> Kord

Dan, please meet Kord Campbell. He is a serial entrepreneur interested in cloud computing, systems management, etc. with a few interesting ideas brewing. He is the tallest person I’ve met in at least a year or two.

Kord, please meet Dan Scholnick. He was one of the first employees at Wily and is now focusing on investments in your area of expertise for Trinity Ventures.

I think it’d be valuable for you two to connect. Let me know if there’s anything further I can provide, otherwise I’ll step back here and let you guys connect directly.

Best,
Matt


Matt was right about it being a valuable connection. Over the next year Dan and I would spend time together drinking coffee, chatting on the phone, and emailing each other about ideas in and around the enterprise and cloud computing space.

It was because of my conversations with Dan that Raffy and I were able to come to the idea of a cloud based logging service. Even when it came time to start pitching Loggly to others, Dan and Noel assisted us in honing our pitch, which eventually led to us being funded by True in a seed round.

Start Small, Go Big

When you are starting out, even the smallest conversation or the shortest email could potentially be the most important one you’ve had in years. Having an idea, growing it, and turning it into a business is a complicated process. That process takes time, and doesn’t happen over a matter of days, or even weeks, but instead over months and even years.

Our relationship with Trinity has been a long time in the works. While it may have appeared to happen rather quickly, Loggly’s efforts with Trinity started at the very beginning of its life.

Inasmuch as your idea should evolve over time, your ability to convey the idea and the opportunity it represents should grow as well. I’ve lost count of how many times Dan has told me to ‘crisp up’ my presentation, discussed with me partnership negotiation strategies, or told me how to approach feedback with our current private beta testers, but I’m sure as hell glad he did.

Without investors like Trinity and True, it’s unlikely I’d be here telling you this story. You would do well to seek out these types of investors when you are looking for direction and guidance for your idea.

Now you’ll excuse me while we get back to coding. We have beta testers who have logs in need of indexing!

Suffering SaaSitash

Posted 15 Apr, 2010 by Kord Campbell in Business and Log Management

Dave Rosenberg posted an opinion about cloud based logging yesterday on his Software, Interrupted blog. Dave starts out by mentioning Gartner predicted IT would spend more money on private cloud than public cloud through 2012. Here’s the exact quote from Gartner:

“Despite the economies of scale offered by public cloud providers, private cloud services will prevail for the foreseeable future while public cloud offerings mature, according to Gartner, Inc. Through 2012, IT organizations will spend more money on private cloud computing investments than on offerings from public cloud providers.”

This statement is a bit like NASA doing a press release announcing the moon is continuing to orbit the earth. Wow! The moon, still here next year? That’s awesome. Of course IT is going to spend more money on virtualization for the next few years. The success of the private cloud can be attributed to the fact virtualization has been around for a good while now, and is finally being pressed into mainstream use behind the firewall. Shoot, I think I was running Wine on some of my Linux boxes back in the mid-90s, which means virtualization has been commercialized for at least 15 years. The idea of virtualizing an OS goes back well into the 60s. Come to think of it, so do I.

The public cloud, specifically IaaS and SaaS, is a grouping of emerging technologies. We’re just now starting to figure out how to wield it correctly for new business models. Poking holes in it at this point is simply rabble rousing by companies whose business models are threatened by it and people who don’t understand it or have a use for it.

It’s a Complicance

Guy Churchward tries to make some good points in his talk with Dave, but at the end of the day, LogLogic is mainly an appliance vendor, and not only do they have big-time COGS to worry about, they also have to figure out how exactly a cloud customer is going to deploy their box on Amazon’s EC2 service. (Hint: They aren’t.) While you might be able to send logs back out of the cloud to an appliance behind the firewall, it’s unlikely to make economic sense to do so in the long term.

While there is a valid point in calling out cloud concerns, security itself is ALWAYS a concern, regardless of whether you run in the cloud or in your own datacenter. Frankly, with Loggly I’m likely better at storing and securing your logs than you are by yourself in your own data center, mostly due to the fact I’m under pressure by multiple people like you to provide a service which is expected at the outset to be secure. It’s no different than the pressure that Google has on them for securing your email, SalesForce for securing your leads, or Amazon securing your credit card info. We’re all culpable here for the security of your data.

Additionally, not all that cloudy data is created equal. A lot of the companies running in the cloud today are web based app companies, and the data they generate is often times very public in nature and not at all affected by compliance concerns. Do you think some user on Flickr cares if I stole all their comments? What about getting access to all those juicy tweets of mine? Oh wait, those are already in the Library of Congress. Nevermind, false alarm!

When IT Rains IT Pours

Log file data is already one of the largest sets of data on the planet. Logging alone in the public cloud is going to be absolutely staggering over the next few years. These trends are being driven by people switching to SaaS based applications, whose infrastructure in turn either requires the elastic capabilities only the public cloud can provide, or whose price point can’t be matched by private cloud offerings.

The elastic nature of these infrastructures means the logs which they generate need to be collected and stored in a centralized location before the box that generated them disappears. There are many types of logs which are valuable to a company for understanding their business, and not so valuable for those data-thieving ruffians everyone keeps talking about.

While the security access data or net-flow information from public cloud vendors might alleviate the concerns of some consumers, I think there are much higher value adds to these offerings by being able to power availability and analytics services around a company’s application via a log file storage platform.

While the private cloud may continue to orbit peacefully for the next few years, the use of it for web based services will decay eventually, and it’ll be relegated to the more mundane stuff like storing my dental records and tracking my orders over on RadiatorBarn.com.

BTW, I’m still waiting on my radiator, Burton.


RSA Security Conference – Cloud the Logging Killer App?

Posted 1 Mar, 2010 by raffy@loggly.com in Business


I am attending the RSA conference this week. The first session I attended was the Cloud Security Alliance (CSA) meeting. Reading some of the accompanying material and listening to some of the presentations and panels, I couldn’t help but notice that the terms auditing and logging were all over.

Here is my attempt at an explanation. It seems that one of the reasons for this is the nature of the cloud. Think about it. You are in an environment where you don’t control much. You are in an environment where you cannot trust most of the infrastructure pieces. For example, if you are using AWS like we are doing at Loggly, you should generally not trust your AMIs (the OS images). Now, what do you do if you don’t trust someone? You observe them, you monitor them. That’s exactly what is happening, and needs to happen, in the cloud: You don’t trust the service. To mitigate this issue, you are going to monitor the service.

And to make this not just my explanation, here is what some panelists during the CSA meeting said:

“Loss of visibility in the cloud” – Scott Chasin, CTO McAfee SaaS Unit
“Lose control and still maintain accountability” – Ken Biery, Verizon Business.

Is the cloud the killer app for logging? And if that’s the case, how do you manage your logs in the cloud? There are hardly any cloud logging solutions out there. I think you see where I am going with this.


We Got Funded

Posted 17 Dec, 2009 by Kord Campbell in Business and Startup

Loggly just closed an A round with True Ventures on Wednesday. From start to finish, Raffy, Jon and I talked to over 20 capital firms, with fund sizes ranging from a few hundred thousand dollars to over a billion dollars invested. In all, we spent exactly 90 days on our capital raising efforts, starting with essentially nothing, and then authoring and tweaking the executive summary, financial model, and investor presentation as we went. Oh, and we wrote a crapload of code in there too. The Loggly Beta deadline waits on no man.

Perhaps it was fate that we spoke with Puneet Agarwal at True Ventures first. True has a massive amount of experience investing in and managing early stage companies. Their record of past successes speaks for itself, and their team has experience with over 100 early stage investments that have generated significant investment returns. Frankly, Raffy, Jon and I are extremely fortunate to be working with the True Ventures team.

That first meeting with Puneet was actually quite easy; we had no expectations other than sharing what we were thinking with someone who knew the space well. That was the calm before the storm though – over the coming weeks we struggled with writing our investor deck, meeting schedules, market size expectations, and investors’ lack of familiarity with our market, and became consumed with correctly casting the “going big” portion of our pitch.

Going Big

The best way to describe what “going big” means is to just be blunt about it. It means, “How are you going to make your idea – your startup – compete effectively in a multi-billion dollar market?” You know, how are you going to get big like Apple, or SalesForce, for example. “Say what?”

A bunch of you capital guys and seasoned entrepreneurs will nod your heads vigorously at this statement. “Yes, yes. You need to show how this gets really big!” And, for all practical purposes, you guys are absolutely right. For a largish VC, say with a fund of a billion or so dollars to invest, they HAVE to go with early startup guys who are going to go really, REALLY big. It’s not a matter of money, it’s actually a matter of time. If you have a BILLION dollars, and you invest a million in each company you fund, that’s a THOUSAND companies you need to talk to, investigate, vet, poke at, wrangle with, grow to love, etc. Yeah, no. That’s not going to work unless you focus on a given market.

You’d have to filter fast. Kick out the guys that MIGHT do a run rate of low 10s of millions a year (crazy, right?) because they aren’t big enough. Shoot for the guys who tell a good story about how they are going to turn into Twitter, or Facebook, and exit for billions. Find those guys! This results in an investor funding a bare handful of early stage startups each year, even if they say otherwise on their website.

It also serves another purpose, all those meetings with those small startups. It allows an investor to form early relationships with companies who are successful at getting through the valley of death. If an investor finds out a startup they talked to early on is doing well, has revenue coming in, is growing, and expanding to the “going big” event, then maybe they might need some more capital. Maybe it’s time to invest.

Entrepreneur Up

If you are doing an early stage startup and are going to raise capital, you need to toughen up a bit before you go out. Remember, it only takes one firm to believe in your idea, but you are going to collect a lot of rejections before that event transpires. If you get a bad review from someone, take their advice in stride and figure out how it applies to you. Tweak as needed, and move on to the next firm to vet what you’ve discovered.

Above all, be honest with yourself and your assumptions and don’t give the investor who gave you a bad review a hard time. If you think you’ve been asked to prove something unreasonable, like how you are going to become a billion dollar company when it’s just the two of you and 1,000 lines of code, then say as much to the investor. Don’t be afraid to say you don’t know, or restate what you do. Don’t be afraid to talk through how you get big with the investors.

After being asked how we got to a billion dollar valuation by one VC, I turned it around on him and asked how he knew he was going really big on his company (which exited for >$1B). His answer? “We didn’t know until we got there!”

Loggly is going to go big, of that I assure you. But first, we have a product to build, customer and partner relationships to forge, and problems to solve for storing a ridiculous amount of log files. Once we have these tasks behind us, we’ll have a great handle on how we’re going to go really, really big.


Maximizing Conversions in a Freemium Webapp - Part One

Posted 19 Nov, 2009 by Kord Campbell in Business and Startup

We’ve been on the investor tour for the last month and a half at Loggly. We spent about a month working on the investor presentation, executive summary and revenue model to prepare. Based on the better feedback we’ve received, we’ve continued to refine the pitch and plan.

We’ve heard everything from “you need to show me how this is a billion dollar company” (lol wut?) to “move the team slide just below the problem statement”. Bad advice aside, I remain focused on our market size and, in particular, the user conversion assumptions that go along with it.

I’ll be transparent here. I expect Loggly’s model to yield a 10-20% conversion rate to paid customer from a signed up freemium account. While I have good justifications for those numbers, posts like this, by Derek Haynes of ScoutApp, would appear to contradict my assumptions, especially with comments like “The 1% rule is the Pi of freemium web apps!”. In all fairness he has an excellent point about focusing on retention optimization, but should we so blindly ignore our conversion pipe just because the 1% rule says we’ll get the users?

Theories != Rules

As it turns out, the “1% rule” is based on observations on other websites and expected conversion from a site visitor to a paid user. Coincidentally enough, this assumption echoes the controversial content creation theory and smacks of Chinese Math.

The main problem with this approach is it completely and utterly ignores the remainder of a given site’s user pipeline. Pipelines, or funnels, are the bread and butter of any sales driven organization. While sites like YouTube may have simple funnels, sites like Loggly are a bit more complicated, and a missed assumption somewhere along the way can turn your assumed 1% conversion into a .1% conversion fairly quickly. To illustrate, here’s a short list of what we’ll be tracking at Loggly (per month):

  • number of ad views on ad network

  • number of visits

  • uniques

  • % visits and uniques from ads

  • bounce rate

  • content consumed (price sheets/videos)

  • participation in the demo (low friction getting started)

  • freemium signups

  • use of a freemium account

  • use of certain features in the freemium account

  • conversion rate to paid accounts

  • use of paid account

  • logging rates

  • upgrades from one paid tier to another

It’s within this pipeline that we focus our attentions to help maximize conversions from visit to paid user.  Not surprisingly, the devil is in the details.

Care and Feeding of Your Pipeline

Portions of a pipeline need to have estimated percentages assigned to them, and then you need to actively monitor those conversions from step to step when the site goes live. I’ll start with the expectation of 1 conversion from 100 unique visits to a paid account and then project assumptions about the other portions of the pipeline:

  • 10,000 ad views yield 100 unique visits (1%)

  • 100 unique visits yield 20 content consumers (20%)

  • 20 content consumers yield 10 demo participators (50%)

  • 10 demo participators yield 5 freemium signups (50%)

  • 5 freemium signups yield 1 paid account (20%)

Notice this gives us our 1% conversion from unique visit to paid user, but still shows a 20% conversion from freemium to paid account. Now we do a breakdown for the intermediate steps, and find places we can optimize. Your mileage may vary, so make sure you understand your steps, and measure the effects of changes on your conversion numbers when you are able to do so!
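If you want to poke at these numbers yourself, here is a minimal Python sketch of the example pipeline. The step names and percentages come straight from the list above; the function names (`walk_funnel`, `overall_rate`) are made up for illustration and are not anything Loggly actually runs.

```python
# Hypothetical funnel model. The rates are the example assumptions from the
# list above, not measured data.
FUNNEL = [
    ("ad views -> unique visits", 0.01),
    ("unique visits -> content consumers", 0.20),
    ("content consumers -> demo participators", 0.50),
    ("demo participators -> freemium signups", 0.50),
    ("freemium signups -> paid accounts", 0.20),
]


def walk_funnel(start, steps):
    """Return (step name, expected count remaining) after each step."""
    counts = []
    remaining = float(start)
    for name, rate in steps:
        remaining *= rate
        counts.append((name, remaining))
    return counts


def overall_rate(steps):
    """Multiply the per-step rates into one end-to-end conversion rate."""
    rate = 1.0
    for _, step_rate in steps:
        rate *= step_rate
    return rate


if __name__ == "__main__":
    for name, count in walk_funnel(10_000, FUNNEL):
        print(f"{name}: {count:g}")
    # Skip the ad step to get the unique visit -> paid account rate.
    print(f"unique visit -> paid: {overall_rate(FUNNEL[1:]):.2%}")
```

Swap in your own measured rates once the site is live, and change one step at a time to see how much a single missed assumption moves the end-to-end number.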

Content is King

Based on prior experience, we know the subject material of Loggly should yield decent conversions from a visitor to content consumer, where the parties are interested in the offering. At previous engagements we were able to get 10K unique video views a month off 100K visits per month. Add in another 10% viewing pricing, examples, etc., and 20% starts to seem reasonable. By driving viewing content, we increase stickiness, educate the users about the service, and increase conversions to freemium signup.

Once the user is viewing content, we want to convert them to using the demo, and then on to creating an account later. Loggly’s demo will actually be a live demonstration of the service using the user’s own content, but it won’t require the user to sign up in the traditional sense. By flipping the signup process on its head, we get the user using the features first, then ask them for account information after they’ve decided they like the service. All in, and based on the frictionless signup, we expect 25% of users who viewed advanced content to convert to a freemium account.

Get Out of the Way!

Remember, your signup form is a HUGE barrier to users. If you require them to provide 12 fields of data to fill out, it’s that much less likely you’ll get a signup out of the deal. Just look at this signup form – I gave up after 5 minutes of trying to fill it out properly (picky password, clearing fields on submission, etc.), even though I’m highly interested in IBM’s offer. Most people won’t spend more than a minute or so filling out your signup form. The lesson here is to get out of the way so a user can start using your service quickly.

Finding Cash in the Pipe

Remember, freemium accounts are not the same as free trials. A free trial expires, but may contain all the features of a paid account. A freemium account doesn’t expire, and usually contains a subset of features of the paid account. By keeping the freemium account active for a longer period of time than the trial (and by keeping the cost of providing services to it low) you extend the amount of time a freemium user can convert to a paid user. While this may not show up in the first few months, over time it will have a compound effect on your conversions as users desire more features from the service.

At this point in the pipeline we are at a 5% conversion rate from visitor to freemium account, which so far seems to be a reasonable assumption. If we take the 10-20% estimated conversion rate to paid assumptions, then we arrive at a .5% to 1% conversion rate from visitor to paid account, which is within striking distance of our 1% goal.
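The arithmetic behind that range is a single multiplication, sketched below in Python. The 5% and 10-20% figures are the assumptions from this post; the names `visitor_to_paid` and `paid_accounts` are hypothetical helpers for illustration.

```python
# Assumed rate from the post: 5% of unique visitors end up with a freemium account.
VISITOR_TO_FREEMIUM = 0.05


def visitor_to_paid(freemium_to_paid, visitor_to_freemium=VISITOR_TO_FREEMIUM):
    """End-to-end rate from unique visitor to paid account."""
    return visitor_to_freemium * freemium_to_paid


def paid_accounts(unique_visits, freemium_to_paid):
    """Expected number of paid accounts for a given number of unique visits."""
    return unique_visits * visitor_to_paid(freemium_to_paid)


if __name__ == "__main__":
    # Low and high ends of the estimated freemium -> paid conversion.
    for rate in (0.10, 0.20):
        print(
            f"freemium -> paid at {rate:.0%}: "
            f"visitor -> paid is {visitor_to_paid(rate):.2%}"
        )
```

Only the high end of the estimate actually reaches the 1% goal; at the low end the pipeline delivers half of it, which is why the freemium-to-paid step is worth watching closely.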

In the next segment of this post, I’ll discuss analyzing how users use their accounts, and steps you can take to maximize conversion from freemium to paid.

Edit (11/12/09): Fixed my bullets in the second list.


Three for the Win

Posted 5 Oct, 2009 by Kord Campbell in Business and Startup

I’m delighted to announce Jon Gifford as the new addition to Loggly’s founding partners. Jon will be taking on the role of CTO at Loggly, and will be acting as Chief Architect for the project. Jon hails from New Zealand, by way of Australia, and did stints at LookSmart, Technorati, Scout Labs, and recently his own startup, Minimal Loop. Jon is a search technology guru who is capable of accelerating search far beyond that of mere mortal men. If you don’t believe me, just try to go search for him on Google. Frankly, I have no clue how he does it. Trust me folks, a bunch of those results are not him.


As for the rest of the team, Raffy is now officially our COO and Chief of Product (BTW, Swiss PMs rule), and I’ll be serving as our Geek CEO and Chief Evangelist. All three of us will be coding on Loggly over the next few months, and with a dash of a soon-to-be-announced designer, we’ll be assembling the framework needed for getting Loggly up and running in private beta over the coming months.

I’m really excited to be working with such smart and capable individuals. I can’t wait to start working with equally smart and capable partners and customers.

See you in your logs!


Five Minutes of Fame

Posted 9 Sep, 2009 by Kord Campbell in Business and Startup

The first pass of the audio for the Loggly intro video was finished today. The clip will be used to further refine the wording of the script and start assembling the various visual elements we’ll need for shooting the video. What we should end up with is something similar to the works by Common Craft.

Not surprisingly, it took a TON of work to get the clip to its current state. Most of that work involved writing and editing the ‘script’ to sound like the actor was talking to an audience. I’m no script writer, so it took a fair amount of work with Brenda (the voice actor to whom I’m married) to nail down what sounded natural when spoken, and what didn’t. There are still some rough spots, but here’s the final version of the first draft if you want to take a listen. The script is here, less some final edits.

Back in the day, when John Leestma and I were involved in producing the Splunk Videos, we would do outtakes of the different developers doing their thing during filming. You had to do something to keep things entertaining – a 5 minute video would take all day to film and occupy 1/3 of the available conference rooms. It’s really too bad we didn’t publish them, some of them were side splitting.


Speaking of cracking up, the wife and I have always enjoyed a good laugh together. Here’s hoping you get a hoot from our silly outtakes. All your logs are belong to us!


Branding Yourself Blue

Posted 21 Aug, 2009 by Kord Campbell in Business and Startup

I bought a Baby Blue Bottle mic for my wife a few years back. She’s an opera singer, and can actually belt out some serious tunes when the mood strikes her. With her pipes and this mic we’ve gotten some amazing takes. If you’ve never seen/handled/heard a Blue mic, well, it’s a thing of joy to behold.


Now that I’m working on Loggly full time, I realized we need a video which would explain the product in an easy to digest format, along the lines of the Twitter in Plain English video. It struck me I have all the bits I need to do this myself, including an amazing performer voice (albeit not using her operatic mode) and a kick ass microphone.

Videos are one of those things that help brand your identity, along with your blog posts, tweets, and of course, your product. The way in which you present your product to the people who use/consume it helps sear your identity forever in their minds. Case in point, Blue Bottle Coffee. A friend of mine took me there a few weeks back and I was blown away by the preparation process. It was a thing of art really, much like the Blue Bottle Mic.


Maybe we should name ourselves Blue Bottle Logs. Whaddya think?


Loggly's Birth

Posted 17 Aug, 2009 by raffy@loggly.com in Business and Startup

Hello, we are Loggly, a startup located in San Francisco. As you can tell from our Web site, we are in the very early stages of our startup. Shoot, we haven’t even had a chance to let our designer take care of the Web site yet! So, tread with care – anything you find here is early stages!

