Arc Forum

2 points by thaddeus 4375 days ago | link | parent

Can't say I've seen a sample program for Postgres, but I would expect it to be fairly straightforward to use any database supporting HTTP as a protocol. And, really, almost any DB would be an upgrade compared to storing each record in its own flat file - correct?

MongoDB: http://docs.mongodb.org/ecosystem/tools/http-interfaces/

Datomic: http://blog.datomic.com/2012/09/rest-api.html

OrientDB: https://github.com/orientechnologies/orientdb/wiki/OrientDB-...

MySQL: http://code.nytimes.com/projects/dbslayer

Riak: http://docs.basho.com/riak/latest/dev/references/http/

And so on.

Of course Anarki also has HTTP Get/Post functionality where Arc 3.1 does not:

https://github.com/arclanguage/anarki/blob/master/lib/http.a...

You would also need to parse the HTTP results.
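
A rough sketch of that fetch-and-parse step, in Python rather than Arc since I can't vouch for the exact API of the Anarki library. The stub server, the `/items/1` URL and the JSON record are all made up for illustration; each database above defines its own endpoints and response format.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class StubDbHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for a database's HTTP interface.

    It answers every GET with one JSON record. A real database would
    define its own URLs and payloads.
    """

    def do_GET(self):
        body = json.dumps({"id": 1, "title": "hello", "by": "lark"}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        pass  # keep the demo quiet


def fetch_record(url):
    """GET a URL and parse the JSON body into a dict."""
    # An empty ProxyHandler makes the request ignore any proxy settings.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(url, timeout=5) as resp:
        return json.loads(resp.read().decode("utf-8"))


def demo():
    """Start the stub on a free local port, fetch one record, shut down."""
    server = HTTPServer(("127.0.0.1", 0), StubDbHandler)  # port 0: pick a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        return fetch_record(f"http://127.0.0.1:{port}/items/1")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

The application side is just `fetch_record`; everything else exists only so the example runs without a real database.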

If you're dead-set on Postgres you could do something like this: http://rny.io/nginx/postgresql/2013/07/26/simple-api-with-ng... Though this seems a little hackish to me.

Personally I'd use something like FleetDB (http://blackstag.com/blog.posting?id=3) over TCP, before I'd go back to that 1 record = 1 file nonsense.

2 points by lark 4374 days ago | link

Thank you for looking up this work. A database makes a lot of things simpler, like indexes, and particularly joins. Thank you also for pointing out Anarki has HTTP Get/Post, I didn't know this.

Could I ask why you wouldn't go back to saving data in files?

It's a clean, fast solution if one doesn't need joins. Using SQL requires building up strings, or putting together an ORM. Better not go down that path if you don't need them.

https://news.ycombinator.com/item?id=6580834

-----

3 points by thaddeus 4374 days ago | link

> wouldn't go back to saving data in files.

It's not the file structure itself that's the issue, it's the mechanisms built around accessing the file(s). For a SMALL HN clone it's probably OK, but if the requirements change even a little you're in for some trouble.

Food for thought: Once your app data can't fit into memory you're going to need to move to a real database or spend big bucks on more hardware. You could do what HN does and purge older data (reloading only when required), but still if your app design requires regular access/writes to older data and it doesn't all fit into memory then your hardware is going to start to thrash (HN had these types of problems from what I read). And with HN, since each record is stored in its own file, it's going to be considerably worse.

Also, it's pretty useful to have your dbs and applications running on different servers. Doing this lets other applications running elsewhere access the data too. So once again it all boils down to your design and expectations, which is why I originally asked if you are just running a HN clone.

FYI I just posted a FleetDB v2 link on the Arc forum if you want to check that option out.

Added: Looks like there's another HTTP library https://github.com/arclanguage/anarki/blob/master/lib/web.ar..., not sure which one is better, but this looks newer.

-----

2 points by lark 4372 days ago | link

Thank you for FleetDB. The ability to execute transactions atomically is useful.

Access to older data is equally a problem for the filesystem and a database. Databases don't get away with it. Once the app data can't fit into memory databases thrash on disk too.

The only argument I see is that save-table saves an entire record in a single file. Wanting access to a specific field of that record means loading the entire record. A lot of databases function this way too, but in general they don't have to. They could bring individual fields into memory. You could claim that as the reason for filesystem access to be slow.

But even then, save-table could be modified to save each field in a separate file.

If the only thing missing from a filesystem is atomicity of multiple transactions, then I'd rather implement just that, in Arc, rather than write an entire database.
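
For reference, the usual building block for this is write-to-a-temp-file-then-rename. A minimal sketch in Python (not Arc); `save_record_atomically` and `load_record` are hypothetical names, and the JSON format is only for illustration. It makes a single record's write atomic and nothing more: a transaction spanning several files would still need a lock or a journal on top.

```python
import json
import os
import tempfile


def save_record_atomically(path, record):
    """Write record as JSON so readers see either the old file or the new one."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the same directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(record, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before the rename
        os.replace(tmp_path, path)  # swaps the new file in as a single step
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_record(path):
    """Read back a record written by save_record_atomically."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        p = os.path.join(d, "item-1.json")
        save_record_atomically(p, {"id": 1, "title": "hello"})
        print(load_record(p))
```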

-----

2 points by thaddeus 4372 days ago | link

> Thank you for FleetDB.

No problem. Note though that FleetDB is all in memory too. The advantage is that you can:

  1. Put the database on its own server with dedicated RAM.
  2. Compose and re-use queries.
  3. Have index support, built in, without code writing.
  4. Have robust/high performance concurrency semantics, not just atomic file writes.

> Databases don't get away with it. Once the app data can't fit into memory databases thrash on disk too.

You're correct in that they too will do seeks to disk, but nowhere close to as many as when handling each record in its own file. Just try loading a million files vs 1 file with a million records.
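
If you want to try that comparison, here is a small Python sketch. The file layout (one JSON file per record vs one JSON-lines file) and the default of 2000 records are assumptions for illustration, not how save-table stores things. Freshly written files will mostly be served from the OS cache, so this measures per-file open/close overhead more than real disk seeks; treat the timings as a rough indication only.

```python
import json
import os
import tempfile
import time


def write_fixtures(root, n):
    """Write the same n records twice: one file each, and one combined file."""
    many_dir = os.path.join(root, "many")
    os.makedirs(many_dir)
    one_path = os.path.join(root, "all.jsonl")
    with open(one_path, "w", encoding="utf-8") as all_file:
        for i in range(n):
            line = json.dumps({"id": i, "title": f"item {i}"})
            with open(os.path.join(many_dir, f"{i}.json"), "w", encoding="utf-8") as f:
                f.write(line)
            all_file.write(line + "\n")
    return many_dir, one_path


def load_many(many_dir):
    """Load every record by opening each file in the directory."""
    records = []
    for name in os.listdir(many_dir):
        with open(os.path.join(many_dir, name), encoding="utf-8") as f:
            records.append(json.load(f))
    return records


def load_one(one_path):
    """Load every record from the single combined file."""
    with open(one_path, encoding="utf-8") as f:
        return [json.loads(line) for line in f]


def compare(n=2000):
    """Return (count_many, count_one, seconds_many, seconds_one)."""
    with tempfile.TemporaryDirectory() as root:
        many_dir, one_path = write_fixtures(root, n)
        start = time.perf_counter()
        many = load_many(many_dir)
        mid = time.perf_counter()
        one = load_one(one_path)
        end = time.perf_counter()
    return len(many), len(one), mid - start, end - mid


if __name__ == "__main__":
    print(compare())
```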

> But even then, save-table could be modified to save each field in a separate file.

Which will multiply the number of disk seeks by roughly the number of fields per record.

As I stated before it really depends upon your design. Ask yourself: Are you going to have say a million records where you have to run a query against all records? Would indexes help with that query on that data? Do you really want to code each index into a table then change all your application code to use them? Or would you rather instruct your database to add an index and discover the existing queries already take advantage of it? Note that the last time I checked, HN offloaded that kind of heavy lifting to a third party service[1] that uses[2], you guessed it, a database!
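
To illustrate the "add an index and the existing query picks it up" point, here is a sketch using SQLite from Python's standard library. SQLite is only a convenient stand-in (it is not one of the databases mentioned in this thread), and the `items` table, `author` column and index name are invented for the example.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column is the detail text


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, author TEXT, title TEXT)")
    conn.executemany(
        "INSERT INTO items (author, title) VALUES (?, ?)",
        [(f"user{i % 100}", f"item {i}") for i in range(10000)],
    )

    # The application query. It is never edited below.
    sql = "SELECT COUNT(*) FROM items WHERE author = ?"

    plan_before = query_plan(conn, sql, ("user7",))
    count_before = conn.execute(sql, ("user7",)).fetchone()[0]

    # One statement to the database; no application code changes.
    conn.execute("CREATE INDEX items_by_author ON items (author)")

    plan_after = query_plan(conn, sql, ("user7",))
    count_after = conn.execute(sql, ("user7",)).fetchone()[0]
    conn.close()
    return plan_before, plan_after, count_before, count_after


if __name__ == "__main__":
    for part in demo():
        print(part)
```

The query returns the same count both times; only the plan changes, from a scan of the whole table to a lookup through `items_by_author`.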

I am not a database expert and I don't want to convince you to use a tool that you don't need to use. I'm just saying I see enough advantages in using a database that I don't plan on using file writes.

[1] https://www.hnsearch.com [2] http://www.thriftdb.com/documentation

-----