193 lines
9.5 KiB
Org Mode
193 lines
9.5 KiB
Org Mode
#+latex_header: \documentclass[12pt]{article}
|
|
#+latex_header: \usepackage[margin=1in]{geometry}
|
|
#+OPTIONS: ^:nil
|
|
|
|
GNU MediaGoblin
|
|
|
|
* About
|
|
|
|
What is MediaGoblin? I'm shooting for:
|
|
|
|
- Initially, a place to store all your photos that's as awesome as,
|
|
more awesome than, existing proprietary solutions
|
|
- Later, a place for all sorts of media, such as video, music, etc
|
|
hosting.
|
|
- Federated, like statusnet/ostatus (we should use ostatus, in fact!)
|
|
- Customizable
|
|
- A place for people to collaborate and show off original and derived
|
|
creations
|
|
- Free, as in freedom. Under the GNU AGPL, v3 or later. Encourages
|
|
free formats and free licensing for content, too.
|
|
|
|
Wow! That's pretty ambitious. Hopefully we're cool enough to do it.
|
|
I think we can.
|
|
|
|
It's also necessary, for multiple reasons. Centralization and
|
|
proprietization of media on the internet is a serious problem and
|
|
makes the web go from a system of extreme resilience to a system
|
|
of frightening fragility. People should be able to own their data.
|
|
Etc. If you're reading this, chances are you already agree though. :)
|
|
|
|
* Milestones
|
|
|
|
Excepting the first, not necessarily in this order.
|
|
|
|
** Basic image hosting
|
|
** Multi-media hosting (including video and audio)
|
|
** API(s)
|
|
** Federation
|
|
|
|
Maybe this is 0.2 :)
|
|
|
|
** Plugin system
|
|
|
|
* Technology
|
|
|
|
I have a pretty specific set of tools that I expect to use in this
|
|
project. Those are:
|
|
|
|
- *[[http://python.org/][Python]]:* because I love, and know well, the language
|
|
- *[[http://www.mongodb.org/][MongoDB]]:* a "document database". Because it's extremely flexible
|
|
(and scales up well, but I guess not down well)
|
|
- *[[http://namlook.github.com/mongokit/][MongoKit]]:* a lightweight ORM for mongodb. Helps us define our
|
|
structures better, does schema validation, schema evolution, and
|
|
helps make things more fun and pythonic.
|
|
- *[[http://jinja.pocoo.org/docs/][Jinja2]]:* for templating. Pretty much django templates++ (wow, I
|
|
can actually pass arguments into method calls instead of tediously
|
|
writing custom tags!)
|
|
- *[[http://wtforms.simplecodes.com/][WTForms]]:* for form handling, validation, abstraction. Almost just
|
|
like Django's templates,
|
|
- *[[http://pythonpaste.org/webob/][WebOb]]:* gives nice request/response objects (also somewhat djangoish)
|
|
- *[[http://pythonpaste.org/deploy/][Paste Deploy]] and [[http://pythonpaste.org/script/][Paste Script]]:* as the default way of configuring
|
|
and launching the application. Since MediaGoblin will be fairly
|
|
wsgi minimalist though, you can probably use other ways to launch
|
|
it, though this will be the default.
|
|
- *[[http://routes.groovie.org/][Routes]]:* for URL routing. It works well enough.
|
|
- *[[http://jquery.com/][JQuery]]:* for all sorts of things on the javascript end of things,
|
|
for all sorts of reasons.
|
|
- *[[http://beaker.groovie.org/][Beaker]]:* for sessions, because that seems like it's generally
|
|
considered the way to go I guess.
|
|
- *[[http://somethingaboutorange.com/mrl/projects/nose/1.0.0/][nose]]:* for unit tests, because it makes testing a bit nicer.
|
|
- *[[http://celeryproject.org/][Celery]]:* for task queueing (think resizing images, encoding
|
|
video) because some people like it, and even the people I know who
|
|
don't don't seem to know of anything better :)
|
|
- *[[http://www.rabbitmq.com/][RabbitMQ]]:* for sending tasks to celery, because I guess that's
|
|
what most people do. Might be optional, might also let people use
|
|
MongoDB for this if they want.
|
|
|
|
** Why python
|
|
|
|
Because I (Chris Webber) know Python, love Python, am capable of
|
|
actually making this thing happen in Python (I've worked on a lot of
|
|
large free software web applications before in Python, including
|
|
[[http://mirocommunity.org/][Miro Community]], the [[http://miroguide.org][Miro Guide]], a large portion of
|
|
[[http://creativecommons.org/][Creative Commons' site]], and a whole bunch of things while working at
|
|
[[http://www.imagescape.com/][Imaginary Landscape]]). I know Python, I can make this happen in
|
|
Python, me starting a project like this makes sense if it's done in
|
|
Python.
|
|
|
|
You might say that PHP is way more deployable, that rails has way more
|
|
cool developers riding around on fixie bikes, and all of those things
|
|
are true, but I know Python, like Python, and think that Python is
|
|
pretty great. I do think that deployment in Python is not as good as
|
|
with PHP, but I think the days of shared hosting are (thankfully)
|
|
coming to an end, and will probably be replaced by cheap virtual
|
|
machines spun up on the fly for people who want that sort of stuff,
|
|
and Python will be a huge part of that future, maybe even more than
|
|
PHP will. The deployment tools are getting better. Maybe we can use
|
|
something like Silver Lining. Maybe we can just distribute as .debs
|
|
or .rpms. We'll figure it out.
|
|
|
|
But if I'm starting this project, which I am, it's gonna be in Python.
|
|
|
|
** Why mongodb
|
|
|
|
In case you were wondering, I am not a NOSQL fanboy, I do not go
|
|
around telling people that MongoDB is web scale. Actually my choice
|
|
for MongoDB isn't scalability, though scaling up really nicely is a
|
|
pretty good feature and sets us up well in case large volume sites
|
|
eventually do use MediaGoblin. But there's another side of
|
|
scalability, and that's scaling down, which is important for
|
|
federation, maybe even more important than scaling up in an ideal
|
|
universe where everyone ran servers out of their own housing. As a
|
|
memory-mapped database, MongoDB is pretty hungry, so actually I spent
|
|
a lot of time debating whether the inability to scale down as nicely
|
|
as something like SQL has with sqlite meant that it was out.
|
|
|
|
But I decided in the end that I really want MongoDB, not for
|
|
scalability, but for flexibility. Schema evolution pains in SQL are
|
|
almost enough reason for me to want MongoDB, but not quite. The real
|
|
reason is because I want the ability to eventually handle multiple
|
|
media types through MediaGoblin, and also allow for plugins, without
|
|
the rigidity of tables making that difficult. In other words,
|
|
something like:
|
|
|
|
#+BEGIN_SRC javascript
|
|
{"title": "Me talking until you are bored",
|
|
"description": "blah blah blah",
|
|
"media_type": "audio",
|
|
"media_data": {
|
|
"length": "2:30",
|
|
"codec": "OGG Vorbis"},
|
|
"plugin_data": {
|
|
"licensing": {
|
|
"license": "http://creativecommons.org/licenses/by-sa/3.0/"}}}
|
|
#+END_SRC
|
|
|
|
Being able to just dump media-specific information in a media_data
|
|
hashtable is pretty great, and even better is having a plugin system
|
|
where you can just let plugins have their own entire key-value space
|
|
cleanly inside the document that doesn't interfere with anyone else's
|
|
stuff. If we were to let plugins to deposit their own information
|
|
inside the database, either we'd let plugins create their own tables
|
|
which makes SQL migrations even harder than they already are, or we'd
|
|
probably end up creating a table with a column for key, a column for
|
|
value, and a column for type in one huge table called "plugin_data" or
|
|
something similar. (Yo dawg, I heard you liked plugins, so I put a
|
|
database in your database so you can query while you query.) Gross.
|
|
|
|
I also don't want things to be too lose so that we forget or lose the
|
|
structure of things, and that's one reason why I want to use MongoKit,
|
|
because we can cleanly define a much structure as we want and verify
|
|
that documents match that structure generally without adding too much
|
|
bloat or overhead (mongokit is a pretty lightweight wrapper and
|
|
doesn't inject extra mongokit-specific stuff into the database, which
|
|
is nice and nicer than many other ORMs in that way).
|
|
|
|
** Why wsgi minimalism / Why not Django
|
|
|
|
If you notice in the technology list above, I list a lot of components
|
|
that are very [[http://www.djangoproject.com/][Django-like]], but not actually Django components. What
|
|
can I say, I really like a lot of the ideas in Django! Which leads to
|
|
the question: why not just use Django?
|
|
|
|
While I really like Django's ideas and a lot of its components, I also
|
|
feel that most of the best ideas in Django I want have been
|
|
implemented as good or even better outside of Django. I could just
|
|
use Django and replace the templating system with Jinja2, and the form
|
|
system with wtforms, and the database with MongoDB and MongoKit, but
|
|
at that point, how much of Django is really left?
|
|
|
|
I also am sometimes saddened and irritated by how coupled all of
|
|
Django's components are. Loosely coupled yes, but still coupled.
|
|
WSGI has done a good job of providing a base layer for running
|
|
applications on and [[http://pythonpaste.org/webob/do-it-yourself.html][if you know how to do it yourself]] it's not hard or
|
|
many lines of code at all to bind them together without any framework
|
|
at all (not even say [[http://pylonshq.com/][Pylons]], [[http://docs.pylonsproject.org/projects/pyramid/dev/][Pyramid]], or [[http://flask.pocoo.org/][Flask]] which I think are still
|
|
great projects, especially for people who want this sort of thing but
|
|
have no idea how to get started). And even at this already really
|
|
early stage of writing MediaGoblin, that glue work is mostly done.
|
|
|
|
Not to say I don't think Django isn't great for a lot of things. For
|
|
a lot of stuff, it's still the best, but not for MediaGoblin, I think.
|
|
|
|
One thing that Django does super well though is documentation. It
|
|
still has some faults, but even with those considered I can hardly
|
|
think of any other project in Python that has as nice of documentation
|
|
as Django. It may be worth
|
|
[[http://pycon.blip.tv/file/4881071/][learning some lessons on documentation from Django]], on that note.
|
|
|
|
I'd really like to have a good, thorough hacking-howto and
|
|
deployment-howto, especially in the former making some notes on how to
|
|
make it easier for Django hackers to get started.
|