--- Log opened Tue Mar 29 00:00:12 2011
13:21 < conseo> mcallan: regarding the ascii limitation and the replacement with XHTML-entities, this is not end user friendly... we need to get these caveats out soon...
13:23 < conseo> mcallan: esp. as you have mentioned, the path to your first position is still not easy. so +1 for your statement that this is still too difficult, this needs to be fixed for the main page poll imo.
15:01 -!- MarkDilley_ is now known as MarkDilley
15:15 -!- MarkDilley_ is now known as MarkDilley
17:19 < mcallan> conseo: yes, i agree about the path. the ASCII is less critical, because it's mainly for admins. it doesn't apply to draft content, only semantic content. it's a SMW bug, so far as i know.
17:20 < conseo> ok, atm. we are english mostly, but this can hurt a lot if we want to reach elsewhere imo
17:20 < conseo> no need to fix it now though
17:21 < conseo> but the entry path needs to be straightforward
17:21 < mcallan> yes
17:46 < conseo> mcallan: puuh i have written a lengthy mail over hours now for the list. don't know what you think about it...
17:47 < conseo> but it would interest me (either by responding to the list or dropping a quick note here)
17:53 < mcallan> conseo: ok, i have to post some code first, and then i'll catch up with mail.
18:52 < mcallan> conseo: sorry c, i didn't read the 2nd line. wow, you posted your manifesto. ;-)
18:54 < conseo> sry, to not have finished the harvester yet, but i thought the mail is more important now
18:56 < conseo> mcallan: if i use the name in the javascript config as a key for the db to map diffmessages to their configured base-url on the web, changing the name in javascript config will render all table entries useless.
18:56 < conseo> i think i have to write a table for services which i try to update every time the harvester is invoked and join that with the message table
18:56 < conseo> otherwise changing the name will mess it up
18:57 < conseo> or choosing the same name or no name...
18:57 < mcallan> don't use a variable (location of log archive) as a key
18:57 < mcallan> use a constant
18:58 < conseo> then the name has to be fixed or i have to introduce a constant
18:59 < mcallan> yes, if you don't have a constant (not saying u don't, i don't know) then you must ask the admin to define one
18:59 < mcallan> is this for mailing lists?
19:14 < conseo> any harvester instance has its own config, so it has its own web-url. why do you ask?
19:15 < mcallan> i was going to suggest a constant
19:23 < conseo> ok
19:24 < mcallan> i still don't know if it's for mailing lists, or chat, or what. but c, maybe it would be better to post your DB schema to the list.
19:25 < mcallan> if you have the wrong schema, your code will be a mess. no harm asking for critique up front.
19:37 < conseo> i don't have a schema yet, i was just asking how i should map the message to the service base-url to combine the url for the client's diff feed
19:38 < conseo> it is only a table
19:41 < mcallan> i use schema in the sense of db/relational structure. if you need quick feedback (that's cool), paste here. otherwise to the list. you have a concrete schema/structure in mind, and it's better to make it explicit.
19:42 < mcallan> (so paste your column names)
19:54 -!- imladris is now known as clarknova
19:58 < conseo> i d | t i t l e | a u t h o r | a d d r e s s e e | u r l | p o l l | p o s _ u r l | s u m m a r y | d i f f _ u r l | s e n t _ t s | p a r s e d _ t s
19:59 < conseo> url is sth. like http://metagovernment.org/pipermail/start_metagovernment.org/2011-February/003639.html atm.
19:59 < mcallan> id | title | author | addressee | url | poll | pos_url | summary | diff_url | sent_ts | parsed_ts
20:00 < conseo> oh sorry messed up sed :-/
20:00 < mcallan> id is?
20:00 < conseo> btw. sed 's/\n//g' does not work on the psql output... why?
20:01 < mcallan> (not sure, never use sed)
20:02 < conseo> id is a base64 encoding of msg.email + msg.body + msg.sentDate.toString()
20:02 < conseo> of the sha1 hash
20:02 < conseo> of that
20:03 < mcallan> so it is uid for message, ok
20:04 < conseo> pos_url is a link to the position page of the author of the message and sent_ts and parsed_ts are timestamps
20:05 < mcallan> full archive url cannot be encoded in db, because it is variable. that's the problem
20:06 < conseo> ok, but the path is fixed, so i can save it and join it with the base-url variable gained from the config in the servlet
20:06 < mcallan> right, and i guess that was your plan
20:07 < conseo> the only thing i need is a key for every config item to ensure that it stays the same
20:07 < conseo> yep
20:07 < mcallan> you somehow need to map msg (row in table) to config of base url
20:07 < conseo> i thought about storing an id to a service table instead and let the voharvester update that table on every run, but loading the config file might be more flexible
20:08 < mcallan> (yes, we talked about that much a couple days ago)
20:08 < conseo> sure
20:09 < conseo> i'll do it in the config with a key (maybe i'll make the name a key simply) and ping you when i am ready
20:09 < mcallan> question is, how exactly does base url change?
20:10 < mcallan> don't have to *do* unless u want to, u can bounce off me first. i don't mind critiquing design
20:11 < mcallan> try to foresee all possibilities of shifting/splitting base urls - all the real life mess of mailing lists
20:12 < mcallan> then base sol'n on what you foresee
20:12 < conseo> hmmm, don't know for other implementations (if they add stuff to the query we might set the base url with marker for replacement then)
20:13 < mcallan> replacement in db? don't think it. ugh.
20:14 < conseo> no i mean we could set base url in the config with a marker to replace that marker with the string from the db, but forget what i said
20:15 < mcallan> oh i see, ok. anyway, we shldnt think of solutions till we state the problem
20:16 < mcallan> it's not just that config may change. that's part of the sol'n. what is the problem - why aren't we going to write urls to db, for e.g.?
20:17 < conseo> write which url to the db for what?
20:18 < mcallan> (i mean 'url', the url to the msg in the list archive) list all the ways that can go wrong. 1) the archive moves. 2) the mailing list is rehosted under a different type of list server. 3) the mailing list is forked. 4) the archive is forked, after a certain date.
20:18 < mcallan> id | title | author | addressee | url | poll | pos_url | summary | ...
20:19 < mcallan> conseo: then propose solution, and test it against continued sanity of the harvester admin!
20:20 < mcallan> (this isn't an easy piece of code)
20:21 < conseo> well it is not particularly interesting, it is just dumb work... man, it takes soo much time to get that thing finally running. i can see it and click it but it still takes months to get it to a deployable state...
20:21 < conseo> but u r right of course
20:21 < conseo> i am not complaining
20:22 < mcallan> it's almost there. and it's a crucial piece of code, so it's worth the wait.
20:23 < mcallan> ok, let me try propose something quick, that leaves room for correction later...
20:24 < conseo> 1) covered 2) might be a problem if the url cannot be split in two halves (e.g. has several parameters)
20:25 < mcallan> 2) forget it. if the archive (in original form) is effectively deleted, and not just moved, then it's a goner!
20:25 < conseo> 4) is critical... but u can simply clone the config entry and change the key and the base url
20:26 < conseo> the old entries will still point to the old archive, while all newly indexed posts will be inserted with the new key in the db
20:27 < mcallan> yes, but that assumes precise timing of config change. don't save yet, no wait, now save!
20:28 < conseo> :-)
20:28 < conseo> you are a hard nut :-D
20:28 < mcallan> trouble is a stream of messages is flying in...
20:29 < conseo> and the point is passed with the old archive. then some entries in the db will point to the last post in thread of the old archive (atm)
20:30 < mcallan> one approach that i've used is to factor the whole problem out to runtime config
20:31 < mcallan> so you ask the config script, what is the base url for this message, please?
20:31 < conseo> does the JavaScriptIncluder get updated at runtime when the file changes?
20:32 < mcallan> what i meant was that script is called for every operation baseURL + path.
20:32 < mcallan> you ask it what the baseURL is, before feeding the whole url to client
20:32 < conseo> uuh that means all bites pass that script... (don't know if this is a problem)
20:33 < mcallan> script looks at message, scratches chin, and decides what baseURL to tell u
20:33 < mcallan> scratches chin very fast
20:33 < mcallan> (hopefully)
20:33 < mcallan> right, it might not scale
20:34 < mcallan> feed must be very *very* fast
20:35 < mcallan> but something *like* that is the quick fix...
20:38 < mcallan> conseo: it's good enough. when speed becomes a problem, then we revisit the solution.
20:39 < mcallan> meantime it is a correct solution, and it is simple.
20:39 < mcallan> you have seen examples of .js that are called at runtime in this manner (for vote count, trust extension, etc).
20:40 < mcallan> all you need to do is give the script suitable info (in function parameters) to decide the baseURL
20:41 < conseo> like key,path+query,parsed_ts?
20:41 < mcallan> and if u do it right, you can extend that info in future without breaking the script. (so it passes the admin sanity test)
20:42 < conseo> ok
20:42 < mcallan> so to start with, it won't matter *what* info you pass. as long as it works.
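
The lookup described above (ask the config script for the base URL of each message, then join it with the stored path) might look roughly like the sketch below. It is only an illustration of the idea, not Votorola code: the function names `baseURLFor` and `fullURL`, the key `metagov-start`, the fork date and the second archive host are all assumptions made up for the example. Only the first base URL and the message path come from the log.

```javascript
// Hypothetical config script: decides the archive base URL for one message.
// Every name, key, host and date below is an illustrative assumption.
const ARCHIVE_FORK_TS = Date.UTC(2011, 5, 1); // assumed date the archive was forked

// Receives the three pieces of info conseo suggests: key, path+query, parsed_ts.
// pathAndQuery is unused here but is passed so a later rule can inspect it.
function baseURLFor(key, pathAndQuery, parsedTs) {
  if (key !== 'metagov-start') {
    throw new Error('unknown list key: ' + key);
  }
  // Messages parsed before the fork keep pointing at the old archive.
  return parsedTs < ARCHIVE_FORK_TS
    ? 'http://metagovernment.org/pipermail/start_metagovernment.org/'
    : 'http://archive.example.org/start/';
}

// Servlet side: the DB row stores only the constant key and the fixed path.
function fullURL(row) {
  return baseURLFor(row.key, row.path, row.parsedTs) + row.path;
}

const row = {
  key: 'metagov-start',
  path: '2011-February/003639.html',
  parsedTs: Date.UTC(2011, 2, 29),
};
console.log(fullURL(row));
```

Because the decision is made per message at read time, the admin does not have to time a config change against the incoming stream of messages; the rule keyed on the timestamp covers case 4) from the list of failure modes.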
20:43 < mcallan> u pass it something like this: http://zelea.com/project/votorola/_/javadoc/votorola/a/trust/TrustExtensionContext.html
20:43 < mcallan> BaseURLChinScratchingContext
20:45 < mcallan> when it makes up its mind, it calls context.setFullURL(fullURL). So you give it complete control over the url.
20:49 < mcallan> conseo: the more i think about it, the more i think it's scalable. it can be made to run very very fast.
20:51 < conseo> mcallan: ok. i'll have a look at it tomorrow
20:51 < conseo> i'll have to go to bed now
20:53 < conseo> mcallan: thx for your advice!
20:53 < conseo> n8 and happy hacking
20:56 < mcallan> ur welcome. n8 c, cu 2morrow
21:07 < mcallan> conseo: one final suggestion (for tomorrow). Instead of worrying about what list ID to give the script's urlSetter (so it knows what baseURL to add) do this: have the construction config tell *you* (make it a config item) the name of its urlSetter function. You just have to call the correct function at runtime, and it will give you the correct URL. That's also fast, cause the script does not have to waste its time trying to figure out what list the message came from, and then switching on that.
21:09 < mcallan> (so the admin may config it, up front, such that different lists have different urlSetter functions)
--- Log closed Wed Mar 30 00:00:26 2011
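
mcallan's final suggestion (the config names its own urlSetter function, and the caller just invokes it with a context) could be sketched as follows. This is a guess at the shape of the design, not the actual Votorola interface: `listConfig`, `urlSetters`, `setStartListURL`, `resolveURL` and the context fields `path` and `sentTs` are hypothetical names. Only `urlSetter` and `context.setFullURL(fullURL)` are taken from the log.

```javascript
// Hypothetical per-list config; field and function names are assumptions.
const listConfig = {
  name: 'start list',
  urlSetter: 'setStartListURL', // config item naming the function to call
};

// Admin-defined functions, one per list. Each gets a context object and has
// complete control over the final URL through context.setFullURL().
const urlSetters = {
  setStartListURL(context) {
    context.setFullURL(
      'http://metagovernment.org/pipermail/start_metagovernment.org/' + context.path
    );
  },
};

// Caller side: build a context for one DB row and invoke the setter named in
// the config. More read-only fields can be added to the context later
// without breaking setters that ignore them.
function resolveURL(config, row) {
  let result = null;
  const context = {
    path: row.path,
    sentTs: row.sentTs,
    setFullURL(url) { result = url; },
  };
  const setter = urlSetters[config.urlSetter];
  if (typeof setter !== 'function') {
    throw new Error('no urlSetter named ' + config.urlSetter);
  }
  setter(context);
  if (result === null) {
    throw new Error('urlSetter did not call setFullURL');
  }
  return result;
}

const url = resolveURL(listConfig, { path: '2011-February/003639.html', sentTs: 0 });
console.log(url);
```

Since each list has its own function, the script never has to work out which list a message came from and switch on it, which is the speed argument made at 21:07.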