Monday, November 12, 2007

Draft Subversion email

Decisions:

o Keep the repository (repo) on www or on another server?

++ I vote for "on www." Randall recommended making the live htdocs/* folders a checked-out working copy of the repo, so that we can make and check in tiny changes right in those folders.

o Repo storage:
Berkeley DB-based
FSFS-based

++ I vote "FSFS." (This is now the default.) Berkeley DB is a legacy holdover, and has corruption vulnerabilities.

o Remote access:
Since we have already set up ssh accounts, it's simplest/safest to use svn+ssh: the svn client opens an ssh session, which then spawns an svnserve process *as the ssh user*. This means that (1) the repository must be owned by a group the user is a member of, (2) the repository must be group-writable, and (3) the user's PATH must include the directory containing svnserve (/usr/bin/ on www). I don't fully understand this part: there may also be umask issues, where files created by svn (or svnadmin?) commands run by a user with the wrong umask set may not be group-writable.
via svnserve daemon (problem: passwords are stored as cleartext.)
via ssh tunnelling
via Apache/webdav
* regular system users using a Subversion client (as themselves) to access the repository directly via file:// URLs;
* regular system users connecting to SSH-spawned private svnserve processes (running as themselves) which access the repository;
* an svnserve process - either a daemon or one launched by inetd - running as a particular fixed user;
* an Apache httpd process, running as a particular fixed user.

++ I vote Apache/webdav: (right? I think that Windows users can just open a webdav resource in a Windows Explorer window.)
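
On the umask worry in the svn+ssh notes above, here is a minimal Python sketch of the mechanism (an illustration only, not anything Subversion ships). It assumes a POSIX system; the function name and the "rev-file" filename are made up for the example. It creates a file under a given umask and reports whether the result is group-writable.

```python
import os
import stat
import tempfile


def group_writable_after_create(umask_value):
    """Create a file under the given umask; return True if it is group-writable.

    Hypothetical helper for illustration: it mimics any process (svn,
    svnserve, svnadmin) creating a new file inside a shared repository.
    """
    old = os.umask(umask_value)  # os.umask returns the previous mask
    try:
        with tempfile.TemporaryDirectory() as d:
            path = os.path.join(d, "rev-file")
            # Ask for rw-rw-rw- (0o666); the bits set in the umask are cleared.
            fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o666)
            os.close(fd)
            return bool(os.stat(path).st_mode & stat.S_IWGRP)
    finally:
        os.umask(old)  # always restore the caller's umask


if __name__ == "__main__":
    for mask in (0o022, 0o002):
        print(oct(mask), group_writable_after_create(mask))
```

With umask 022 the new file loses its group-write bit, so the next user in the group cannot modify it; with umask 002 it keeps it. One common remedy is to have each user reach svnserve through a small wrapper that sets umask 002 first.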


o What user/group will own the repository? web:web? Or do we want to create a dedicated user:group, svn:svn?

What data do you expect to live in your repository (or repositories), and how will that data be organized?
text files, binaries (pdf, jpeg, gif, wmv)
/export/www/* (one project root or two?)
o A single repository for multiple projects, or give each project its own repository?
/export/www/
clearinghouseAdmin/
htdocs/
htdocs-cp2info/
mediaLibrary/
phplib/
railsDevelopment/
ttplib/

? ? Maybe:
repository : Path
www : /export/www/htdocs (pages on www)
www-lib : /export/www/ (ttplib, phplib, admin tools etc)
++ Actually, I vote for all in one: they are all related, and we should be able to ask about (or modify, or migrate elsewhere) the entire history of a single project.

Where will your repository live
www server
and how will it be accessed?
directly (command line), network server (WebDav?)
repository browsing interfaces

e-mail commit notification

data backup strategy
backed up with www backups? Right?
What types of access control and repository event reporting do you need?

Which of the available types of data store do you want to use?
FSFS


Friday, November 09, 2007

Automating CD lookups

I want to look up info on my out-of-control CD collection: UPC, Title, Artist, ASIN, quantities and prices at Amazon, quantities at SwapaCD and/or lala.com.

What I have: a spreadsheet with titles/artists, some UPCs, some ASINs.

So I think I need a handful of scripts:

Look up UPC (given title / artist)

Look up ASIN (given title / artist)

Look up quantities and prices at Amazon (given ASIN)

I also need a data model (which Kernighan & Ritchie say is more fundamental than code).
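
A first pass at that data model might look like the following Python sketch. The field names, the `CD` class, and the `pending_lookups` helper are all assumptions of mine, derived from the columns listed above (UPC, Title, Artist, ASIN, Amazon quantity and price, SwapaCD and lala.com quantities); the ASIN in the usage example is a placeholder, not a real identifier.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import List, Optional


@dataclass
class CD:
    """One row of the spreadsheet; anything not yet looked up stays None."""
    title: str
    artist: str
    upc: Optional[str] = None
    asin: Optional[str] = None
    amazon_quantity: Optional[int] = None
    amazon_price: Optional[Decimal] = None
    swapacd_quantity: Optional[int] = None
    lala_quantity: Optional[int] = None


def pending_lookups(cd: CD) -> List[str]:
    """Return the lookup scripts that still need to run for this CD.

    The order matters: the Amazon quantity/price lookup needs an ASIN,
    so "asin" is always listed before "amazon_offers".
    """
    steps = []
    if cd.upc is None:
        steps.append("upc")
    if cd.asin is None:
        steps.append("asin")
    if cd.amazon_quantity is None or cd.amazon_price is None:
        steps.append("amazon_offers")
    return steps


if __name__ == "__main__":
    # Placeholder values for illustration only.
    row = CD(title="Some Album", artist="Some Artist", asin="PLACEHOLDER")
    print(pending_lookups(row))
```

Keeping every looked-up field optional lets the same record type represent a spreadsheet row at any stage, and each script only has to fill in the fields it owns.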