Hacker News | pnachbaur's comments

Out of curiosity, how do you weigh the straightforward cost of a service like Algolia vs the full TCO of bringing it in-house?


The way I would view it would be to outsource it until you can afford to hire someone to work on that full time. Search is a bitch. Elasticsearch makes it a bit easier, but if you're a startup and search isn't your primary business it's not a bad idea to outsource it to experts.


> The way I would view it would be to outsource it until you can afford to hire someone to work on that full time.

Exactly. I would outsource any problem that can be outsourced (in this case, search to Algolia) until we are far enough along that we can tackle it ourselves.

I don't think there's a real equation for it until you get to the point you can no longer afford it, but at that point you probably waited too long (excepting the cases where you've run into stratospheric growth).

I think as a team you should be looking ahead at growth estimates and making the judgement call on when to begin bringing it in-house. Ideally you want the opportunity to run both side by side for a while.

And, honestly, what if you architect it so that the cost never becomes a pain point, or you simply don't grow enough for it to matter? As long as the service provider is doing a good job, you could use the opportunity to extend your product in other directions. Why build search if your focus is on something else and your provider is affordable?


I tend to think of time series data as being several orders of magnitude larger than 23 million data points per week (38 per second), but now I can't seem to find a good definition of time series data. Anyone have thoughts on the rough threshold between event data and time series data? I think of arrays of hundreds or thousands of individual sensors that each take 10 measurements a second as "different" from user-generated data that is time-ordered.


I agree, time series should be more like 1000 measurements taken 100 times a second. Industrial acquisition data is not the same thing as timestamped web log data.
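To put rough numbers on the gap, here is a quick back-of-the-envelope sketch in Python. It only uses the figures quoted in this thread (23 million points per week, and 1000 measurements taken 100 times a second); the function and variable names are made up for illustration.

```python
# Back-of-the-envelope comparison of the two data rates mentioned above.
SECONDS_PER_WEEK = 7 * 24 * 60 * 60  # 604,800 seconds


def per_second(points_per_week: float) -> float:
    """Convert a weekly volume into an average per-second rate."""
    return points_per_week / SECONDS_PER_WEEK


# Event-style data: 23 million points per week, about 38 per second.
event_rate = per_second(23_000_000)

# Sensor-style data: 1000 measurements, each taken 100 times a second.
sensor_rate = 1000 * 100

# How many times larger the sensor workload is than the event workload.
ratio = sensor_rate / event_rate

print(f"event data:  {event_rate:.1f} points/s")
print(f"sensor data: {sensor_rate} points/s")
print(f"ratio:       {ratio:.0f}x")
```

Under those assumptions the sensor workload comes out a bit more than three orders of magnitude larger, which fits the "several orders of magnitude" intuition above.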


Most boarding schools get compared to Hogwarts, and when you add a focus on 'technological wizardry' ... ¯\_(ツ)_/¯


We absolutely do not sell any data, and only use it internally to debug issues and optimize the system :)


You could use median instead of filtering out outliers explicitly so your averages don't get skewed.


Median might solve some of that issue. Unfortunately, it is computationally heavy to compute a median on a rolling basis, unlike an average. Part of the reason the filters work so quickly is that when I add or remove an item from the active set, I can just add or subtract that one item from the total and the count. With median, I'd have to keep the active list sorted, which even with a binary tree under the covers is still more expensive than two math operations. The filtering library under the covers is crossfilter.
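A minimal sketch of that trade-off, in Python rather than JavaScript and not using crossfilter's actual API. The class names `RunningMean` and `RunningMedian` are hypothetical, and a sorted list with `bisect` stands in for the binary tree mentioned above.

```python
import bisect


class RunningMean:
    """Tracks a total and a count, so add/remove are two arithmetic operations."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def add(self, x):
        self.total += x
        self.count += 1

    def remove(self, x):
        self.total -= x
        self.count -= 1

    def value(self):
        return self.total / self.count if self.count else None


class RunningMedian:
    """Keeps the active values sorted; every add/remove needs a search and a shift."""

    def __init__(self):
        self.values = []

    def add(self, x):
        bisect.insort(self.values, x)  # binary search, then O(n) list insert

    def remove(self, x):
        i = bisect.bisect_left(self.values, x)
        if i < len(self.values) and self.values[i] == x:
            self.values.pop(i)  # O(n) shift to close the gap

    def value(self):
        n = len(self.values)
        if n == 0:
            return None
        mid = n // 2
        if n % 2:
            return self.values[mid]
        return (self.values[mid - 1] + self.values[mid]) / 2


mean, median = RunningMean(), RunningMedian()
for x in [1, 2, 3, 100]:  # 100 plays the role of an outlier
    mean.add(x)
    median.add(x)

print(mean.value(), median.value())  # the outlier drags the mean far more than the median
```

The mean only ever touches two numbers per update, while the median has to maintain the ordering of everything in the active set, which is the extra cost described above.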


If you go that way, be sure to read Jepsen: Elastic Search [1]

[1] http://aphyr.com/posts/317-call-me-maybe-elasticsearch


to get proper perspective, be sure to read all of their articles


Reminded me of Storehaus [1] but taken further. Curious to see what K/V drivers they've written so far.

[1] https://github.com/twitter/storehaus


There is a video under "June 18 | Seattle" on the linked page


Weird. I don't see it. Could be Ghostery or some similar ad blocking plugin I guess.


You don't need any add-ons to selectively block Flash nowadays in Firefox: Menu > Add-ons > Plugins > Flash, then choose "Ask to activate" instead of "Always activate".


Yeah, I have "click to play plugins" enabled in Chrome and didn't see it until I saw these comments and manually enabled plugins.


If you're interested in push notifications based on analytics data, check out Pushpop [1] (doesn't explicitly support iOS yet).

[1] https://github.com/keenlabs/pushpop


I actually hadn't been using Cover, but I'm a big fan of Aviate. I greatly prefer it over the stock/Samsung experience.

