Posted over 16 years ago
Technorati tags: clam
Video
-- To view it on YouTube, click HERE (you may choose the HD mode to view a high quality version)
-- To download this video, click HERE
Screenshots
Project-1: Web-based extractor
Project-2: Clam Aggregator
-- Setting the Configuration
-- Editing the Multi-level Descriptors
-- Viewing the Combined Schema
|
Posted over 16 years ago
Technorati tags: clam
|
Posted over 16 years ago
I don't know how useful this will be to anyone else, since it is rather specific to programmers (ones who use SVN) and it also opens the files in question with gvim (which requires some familiarity of its own), so whoever could use it will probably know how to write this script themselves… But anyway, it was useful to me, so I'm sharing it:
gvim `svn stat -q | sed -e 's/^M[ \t]*\(.*\)/\1/g' | tr '\n' ' '`
… and to make this post a bit more useful to more people, I'll explain how that line works:
To begin with, gvim is a text editor that adds a graphical interface to the old and beloved vim. So in principle, what we want is to call gvim followed by the list of files we want to open. Let's now see how that list is obtained, thanks to the power of unix pipes, its powerful commands, and the bash shell:
Let's start with bash. Besides allowing pipelines (which I'll explain shortly), in bash whatever you write between backticks (`) is executed in a separate instance, and its output is substituted back into the command line and interpreted literally. A simple example: bash has, as DOS did, a command called "echo" that prints whatever you pass it as a parameter. That is, echo "hola mundo" prints "hola mundo" on the screen. Now, if we run `echo "hola mundo"`, bash replies: bash: hola: command not found. Why? Because bash has tried to execute the command hola with the parameter mundo.
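Rendered as a terminal session (the prompts are illustrative):
$ echo "hola mundo"
hola mundo
$ `echo "hola mundo"`
bash: hola: command not found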
Well, that is exactly what happens in the line above: gvim receives the output of the commands executed between the backticks.
Let's now look at those commands:
svn stat -q
SVN (Subversion) is a version control system widely used for developing software cooperatively (although it can serve any other purpose where you need a history of file modifications, whether cooperative or individual). Anyone used to editing wikis will know that you can easily go back to any previous version (revision) of each article/page. SVN is similar: the tracked files are kept in a database on a server, and at any moment you can retrieve a specific revision of each file, compare one revision against another, and so on…
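For instance, two typical operations on a tracked file (the file name and revision numbers are illustrative):
$ svn update -r 1234 somefile.cxx      # bring back a specific revision
$ svn diff -r 1234:1250 somefile.cxx   # compare two revisions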
Well then, what the command "svn stat -q" does is report the files modified locally with respect to the server. For example, right now I have some local changes relative to the HEAD (the latest revision on the server) of CLAM. Running "svn stat -q" in the CLAM directory I get:
M NetworkEditor/src/MainWindow.hxx
M NetworkEditor/src/NetworkCanvas.hxx
M CLAM/src/Flow/Networks/FlattenedNetwork.cxx
M CLAM/src/Flow/Networks/FlattenedNetwork.hxx
M CLAM/src/Flow/Networks/BaseNetwork.hxx
M CLAM/src/Flow/Networks/BackEnds/JACKNetworkPlayer.cxx
Indeed, those are the files I want to edit. The M tells me they have local modifications. But… how do we feed that list to gvim?
This is where pipelines come in, along with the powerful unix commands.
What does a pipeline do? It lets the console text output generated by one program/command be passed as input to the next one. In this case, the output of "svn stat -q" (the lines shown above, for example) is passed as input to the sed command, and then the output of sed is passed as input to the tr command.
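A toy pipeline, just to see the chaining (the commands themselves don't matter):
$ printf 'one\ntwo\nthree\n' | grep t | wc -l
2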
What does sed do? It searches for and replaces text patterns, with all the versatility of regular expressions[1]. Regular expressions can get very complex, and explaining how they work would take a long time. If you want to dig deeper into the subject, I recommend looking at this site. Here it is enough to analyze how the expression used in the script above works:
sed -e 's/^M[ \t]*\(.*\)/\1/g'
"sed -e" says that what follows is a regular expression, which goes between single quotes (' '); with the form 's/inputpattern/outputpattern/g' it replaces the input pattern with the output pattern. If we wanted to change every letter "a" into an "e", we could put between the quotes: 's/a/e/g' (an anagram of the unmentionable :-/). In this case, the input pattern is '^M[ \t]*\(.*\)'
Let me explain:
^ means beginning of line.
M is the M of the modified files (which comes right after the beginning of line in the output of svn stat -q).
[ \t]*: what goes between brackets (in this case a space and \t, which stands for "tab") forms a "class" of characters. The * means there can be any number (0 or more, as many as found) of those characters. So [ \t]* matches spaces and/or tabs, if there are any.
\(.*\): the . is a wildcard, that is, any character. .* matches any number (0 or more) of any characters. The parentheses group the content inside them (they carry a \ in front because sed's basic regular expression syntax requires it to make them grouping operators rather than literal parentheses).
\1 in the output pattern writes out the first group captured by the parentheses in the input pattern.
Summing up: 's/^M[ \t]*\(.*\)/\1/g' finds every line that starts with an M, then has 0 or more spaces or tabs, followed by 0 or more characters; it groups those last characters and replaces the whole line with just that group.
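Trying it on a single hand-written line (taken from the svn output above):
$ echo "M       NetworkEditor/src/MainWindow.hxx" | sed -e 's/^M[ \t]*\(.*\)/\1/g'
NetworkEditor/src/MainWindow.hxx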
Let's see, then, what svn stat -q | sed -e 's/^M[ \t]*\(.*\)/\1/g' returns:
NetworkEditor/src/MainWindow.hxx
NetworkEditor/src/NetworkCanvas.hxx
CLAM/src/Flow/Networks/FlattenedNetwork.cxx
CLAM/src/Flow/Networks/FlattenedNetwork.hxx
CLAM/src/Flow/Networks/BaseNetwork.hxx
CLAM/src/Flow/Networks/BackEnds/JACKNetworkPlayer.cxx
We have now obtained the file names we wanted!!!
So what does the last pipe (tr) do?
The previous command returned the list of files, but one per line, while gvim needs them all passed on a single line. Since sed's regular expression patterns apply only line by line, we need something else to remove the line breaks afterwards. That is the job of "tr '\n' ' '": it replaces the '\n' (line breaks) with ' ' (spaces).
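Again on a toy input:
$ printf 'uno\ndos\ntres\n' | tr '\n' ' '
uno dos tres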
Then, svn stat -q | sed -e 's/^M[ \t]*\(.*\)/\1/g' | tr '\n' ' ' returns:
NetworkEditor/src/MainWindow.hxx NetworkEditor/src/NetworkCanvas.hxx CLAM/src/Flow/Networks/FlattenedNetwork.cxx CLAM/src/Flow/Networks/FlattenedNetwork.hxx CLAM/src/Flow/Networks/BaseNetwork.hxx CLAM/src/Flow/Networks/BackEnds/JACKNetworkPlayer.cxx
Since that sits between the backticks in the original command, it is what gets passed to gvim as parameters, making it open all those files.
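If you use this often, you could wrap the one-liner in a shell function; the name svnedit is made up:
# Hypothetical helper for your ~/.bashrc; the function name is an invention.
svnedit() {
    gvim `svn stat -q | sed -e 's/^M[ \t]*\(.*\)/\1/g' | tr '\n' ' '`
}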
Isn't Linux great? ;-D
[1] I found the cited site's definition of regular expressions amusing: You can think of regular expressions as wildcards on steroids.
|
Posted over 16 years ago
This is the first screencast I have made about my Google Summer of Code '08 project.
It shows a simple network in which I load a MIDI Source (a Processing that creates a MIDI input port), a MIDINote2Freq (which transforms the MIDI Note message into a pair of real numbers giving the note's frequency and amplitude), a simple sinusoidal oscillator, and an Audio Sink (which creates an audio output port). I also add an oscilloscope to get visual feedback.
Click here to view the embedded video.
|
Posted over 16 years ago
The full version of "Yo Frankie" is out!!
Downloads and links to videos/tutorials about the Blender game engine: http://www.yofrankie.org/download/
|
Now that the CLAM NetworkEditor has become such a convenient tool to define audio processing systems, we started to use it massively instead of C code to integrate processings. Even so, we found that whenever we changed a configuration structure, a port/control name, or even a processing class name, the network xml files had to be edited by hand because they didn't load anymore. That problem led us to avoid such kinds of changes, which is not a sane option. 'Embrace change', the agile gurus say, and so we did. We have developed a python tool to support network refactoring. It can be used both as a python module and as a command line tool to batch-modify CLAM network files using high level operations such as:
-- Renaming a processing class name
-- Renaming a processing name
-- Renaming processing ports or controls
-- Renaming a processing configuration parameter
-- Removing/adding/reordering configuration parameters
-- Setting configuration parameter values
The script just does XPath navigation and DOM manipulation in order to know which pieces need to be changed. Each high level command is just a couple of lines of python (a sketch of how a batch run could look from the shell closes this post). We are starting to use it to:
-- Adapt to changes in the C code
-- Change network configuration parameters in batch
-- Provide migration scripts for users' networks
About the last point, we plan the next release to provide network migration scripts containing a command set such as:
ensureVersion 1.3
renameClass OutControlSender NewClassName
renameConnector AudioMixer inport "Input 0" "New Port Name 0"
renameConfig AudioMixer NumberOfInPorts NInputs
setConfigByType AudioMixer NInputs 14
upgrade 1.3.2
Still some short term TODOs:
-- Include the clam version in the network xml so that the ensureVersion and upgrade commands work.
-- Also integrate Qt Designer UI files in the refactorings.
-- Add some other commands as they are needed.
Happy CLAM networks refactoring!
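As mentioned above, here is how a batch run over many networks could look from the shell. This is only a sketch under assumptions: the script name clamrefactor.py and its -c option are made-up placeholders, not the tool's confirmed interface.
# Hypothetical batch run: apply one refactoring command to every network file.
# 'clamrefactor.py' and '-c' are illustrative names, not the real interface.
for net in networks/*.clamnetwork; do
    clamrefactor.py -c 'renameClass OutControlSender NewClassName' "$net"
done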
|
A couple of weeks ago we had our Telefonica Open Research Day. Finally I found some time to blog about it.
It was a half-day event in which we had both talks and demos showcasing the latest developments in Telefonica's scientific teams.
In the talks we had several invited speakers: Mateo Valero, Head of the Computer Architecture Department at UPC and Director of the Barcelona Supercomputing Center, gave a talk on the "Future of Supercomputers"; Sandeep K. Singhal, Product Manager of the Windows Network Team, gave a talk on the "Challenges of Networking in the 21st Century"; Federico Casalegno, head of the Design Lab at the Massachusetts Institute of Technology, talked about their projects related to social mobile and information sharing; and Arturo Azcorra, of Universidad Carlos III and IMDEA Networks, talked about Internet 2.
Then we had our Multimedia Scientific Director, Nuria Oliver, talk about the challenges related to the explosion of content, seamless connectivity, and users' decreasing attention span. And Pablo Rodriguez, our Internet Scientific Director, talked about using the network as a FedEx service to ship bulk data from one point on the globe to another.
I took part in a panel entitled "Search, Recommendations, and Personalization: Text and Beyond" where we also had Hugo Zaragoza from Yahoo Research, Xavier Serra from the Music Technology Group at UPF, Ferran Marques from UPC, Marc Torrens from Strands, and Alejandro Jaimes, head of the Telefonica scientific group on Data Mining and User Profiling. Xavier Serra and Ferran talked about ways to bridge the semantic gap in multimedia - in the case of music and images, respectively. Hugo talked about the challenges in Search, and Marc on the power of Recommendation. Alejandro talked about how culture should be taken into account in applications and algorithms that deal with people. I gave a talk on how Recommender Systems can become an alternative to Search engines (you can find my slides here).
You can find all the slides for the talks here.
Then we had a bunch of interesting demos from many of our research projects: Josep M. Pujol and his Social Search Engine prototype; Joachim Newman and his project analyzing Bicing users' behavior; Xavier Anguera presenting our project on multimodal interfaces for picture browsing on the cell phone; the This or That project on social interaction over the cellphone and Facebook for shopping, in conjunction with MIT; Xiaoyuan Yang's Kangaroo P2P solution for video broadcasting; several projects on wireless, networks...
Overall, a quite successful event, with over 120 people from different backgrounds (universities, industry...) completing a full house.
Let me know if you need further information on any project or if you would like to be included in next year's list of guests.
|
So long since the last post, and a lot of things to explain (GSoC results, the GSoC Mentor Summit, QtDevDays, the CLAM network refactoring script, typed controls...). But let's start by explaining some work we did on a back-to-back system we recently deployed for CLAM and our 3D acoustics project at Barcelona Media.
Back-to-back testing background
You, extreme programmer, might want to have unit tests (white box testing) for every single line of code you write. But sometimes this is a hard thing to achieve. For example, canonical test cases that exercise a single piece of code are very hard to find for audio processing algorithms. You might also want to take control of a piece of untested code in order to refactor it without introducing new bugs. In all those cases back-to-back tests are your most powerful tool.
Back-to-back tests (B2B) are black box tests that compare the output of a reference version of an algorithm with the output of an evolved version, given the same set of inputs. When a back-to-back test fails, it means that something changed, but normally it doesn't give you any more information than that. If the change was expected to alter the output, you must revalidate the new output and make it the new reference. But if the alteration was not expected, you should either roll back the change or fix it.
In back-to-back tests there is no truth to be asserted. You just rely on the fact that the last version was OK. If b2b tests go red because of an expected change of behaviour but you don't validate the new results, you will lose all control over posterior changes. So it is very important to keep them green, validating every new correct result. Because of that, B2B tests are very helpful in combination with a continuous integration system such as TestFarm, which can point you to the guilty commit even if further commits have been done.
CLAM's OfflinePlayer is very convenient for back-to-back testing of CLAM networks. It runs them off-line, specifying some input and output wave files. Automate the check by subtracting the output from a reference file and checking the level against a threshold, and you have a back-to-back test (a hand-rolled sketch of that check closes this post). But maintaining the outputs up to date is still hard. So we have developed a python module named audiob2b.py that makes defining and maintaining b2b tests on audio very easy.
Defining a b2b test suite
A test suite is defined by giving a back-to-back data path and a list of test cases, each one defining a name, a command line, and a set of outputs to be checked:
#!/usr/bin/env python
# back2back.py
import sys  # the script passes sys.argv to the runner below
from audiob2b import runBack2BackProgram

data_path = "b2b/mysuite"
back2BackTests = [
    ("testCase1",
        "OfflinePlayer mynetwork.clamnetwork b2b/mysuite/inputs/input1.wav -o output1.wav output2.wav",
        [
            "output1.wav",
            "output2.wav",
        ]),
    # any other test cases there
]
runBack2BackProgram(data_path, sys.argv, back2BackTests)
Notice that this example uses OfflinePlayer but, as you write the full command line, you are not limited to just that. Indeed, for 3D acoustics algorithms we are testing other programs that also generate wave files.
Back-to-back work flow
When you run the test suite the first time (./back2back.py without parameters) there are no reference files (expectations) and you will get a red. The current outputs will be copied into the data path like this:
b2b/mysuite/testCase1_output1_result.wav
b2b/mysuite/testCase1_output2_result.wav
...
After validating that the outputs are OK, you can accept a test case by issuing:
$ ./back2back.py --validate testCase1
The files will be moved to:
b2b/mysuite/testCase1_output1_expected.wav
b2b/mysuite/testCase1_output2_expected.wav
...
And the next time you run the tests, they will be green. At this point you can add and commit the 'expected' files to the data repository.
Whenever the output is altered in a sensible way and you get a red, you will again have the '_result' files and also some '_diff' files so that you can easily check the difference. All those files are cleaned as soon as you validate them or you get the old results back. The main benefit of this is that the management of expectation files is almost automated, so it is easier to keep them green.
Supporting architecture differences
Often the same algorithm yields slightly different values depending on the architecture you are running on, mostly because of different precision (i.e. 32 vs. 64 bits) or different implementations of the floating point functions. Having back-to-back tests change all the time depending on which platform you run them on is not desirable. The audiob2b module generates platform dependent expectations when you validate them with the --arch flag. Platform dependent expectations are used instead of the regular ones only if expectations for the current platform are found.
Future
The near future of the tool is simply being used. We should extend the set of controlled networks and processing modules in CLAM, so I would like to invite other CLAM contributors to add more back2back's. Place your suite data in 'clam-test-data/b2b/'. We should decide where the suite definitions themselves should be placed; maybe somewhere in CLAM/test, but that wouldn't be fair because of dependencies on NetworkEditor and maybe on plugins. Also, a feature that would extend the kind of code we control with back-to-back would be supporting file types other than wave files, such as plain text files or XML files (handled in some way smarter than plain text). Any ideas? Comments?
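As promised above, here is a hand-rolled sketch of the subtract-and-threshold check, using the sox command line tool rather than audiob2b's internals; it assumes out.wav and ref.wav share format, sample rate and length:
# Mix the output with the polarity-inverted reference; the result is the difference signal.
sox -m -v 1 out.wav -v -1 ref.wav diff.wav
# The 'stat' effect reports the residual's maximum amplitude (on stderr); near 0 means a match.
sox diff.wav -n stat 2>&1 | grep "Maximum amplitude"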
|
My next stop was Vancouver, for the ACM Multimedia conference.
Vancouver is an awesome city and I was fortunate to have two friends (Alberto and Juan) living there. So I was able to attend the conference, have fun, and eat wonderful sushi, all at the same time.
The first two days I was invited to the ACM SIGMM retreat. Around 35 top researchers from around the world gathered to discuss the future of Multimedia. We discussed issues related to research, education, and industry relations. Overall this was probably the best part of the conference, as I got to meet and talk at length with very interesting people. Socializing in such a setting was much easier than in the typical, and overcrowded, coffee breaks. (In the picture: Nicolas Georganas and Wolfgang Effelsberg, chairs of one of the breakout sessions.)
Outside of the conference, I also got to meet Tim Bray, currently Director of Web Technologies at Sun and well known as one of the fathers of technologies such as XML and RDF. I admire him for this and for his views on agile development and Open Source. I was delighted to hear from him that he is completely in favor of REST and does not believe in the Semantic Web.
The conference itself was pretty good, although I have to admit that I got much more out of demos and posters than out of regular paper presentations. To be honest, I have come to think that most presentations at conferences are a waste of time. Unless you have a good presenter (and that happens around 10% of the time) you are better off reading the article. Demos and posters, however, are different, as they offer one-to-one interaction with the authors.
The Open Source prize was sponsored by us (Telefonica Research) and the winner was the Network-Integrated Multimedia Middleware, a pretty amazing piece of software for addressing multimedia devices over the network. It is great to see such awesome projects getting the prize we won in 2006 for CLAM.
|
Last week I attended the 2nd ACM Conference on Recommender Systems in Lausanne. Despite being only in its second edition, the Recsys conference is already showing signs of becoming a top tier conference very soon. 121 submissions (a 100% increase over the first edition) and a 31% acceptance rate (including short papers and posters) are indeed very promising figures. Of course this brings an overall increase in quality across all accepted papers.
Another highlight of the conference was the three very interesting tutorials on the first day. Robin Burke talked about robustness in Recommender Systems. Yehuda Koren, of Netflix Prize world fame, gave an interesting tutorial on the approach they are using to stay at the top of the Netflix Challenge leader board. Yehuda has now left AT&T and joined Yahoo Research in Israel, so it is unclear how much he will continue working on the prize from now on. Finally, Gediminas Adomavicius talked about context-aware Recommender Systems in another very interesting tutorial.
One of the interesting differences of Recsys relative to other conferences in related fields is the high percentage of industry participants (around 50%). The folks from Strands have been doing an amazing job of making sure this does not become a purely academic conference. This year they even offered a $100,000 prize for the best start-up idea related to Recommender Systems. The winners, also co-leaders of the Netflix Prize, presented an IPTV recommender, which reminded me a lot of some of the work we are doing at Telefonica R&D. You can read more about the prize in Strands' blog.
At the end of the conference we made a bid to bring Recsys to Barcelona in 2010 (next year it is in New York). If we get selected, I will be co-chairing the conference with Francisco Martin, Strands' CEO. Stay tuned for more info on how this develops.
|