Progress in CW 28/29 — 14. Juli 2015

In the last week we worked hard on the issues that came up at the evaluation date.
At that date we saw that the project needs to become more robust. The main issue was that elements in the virtual media shelf were accessed accidentally. Therefore the sensitivity of the right hand has been lowered. To avoid confusing the user, the shelf is now centered when an element is entered. Another point was identifying the media type of a single element. To make it clear whether an object is an audio, video or image file, a watermark has been introduced.
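
As an illustration of the media type identification described above, the following is a minimal sketch and not our actual implementation. It assumes that the type can be derived from the file extension; the names `media_kind`, `watermark_for` and the watermark labels are hypothetical.

```python
import mimetypes

# Hypothetical watermark labels, one per media kind (not taken from the project).
WATERMARKS = {"audio": "AUDIO", "video": "VIDEO", "image": "IMAGE", "unknown": "?"}


def media_kind(filename: str) -> str:
    """Return 'audio', 'video', 'image' or 'unknown' based on the file extension."""
    mime, _encoding = mimetypes.guess_type(filename)
    if mime is None:
        return "unknown"
    top_level = mime.split("/", 1)[0]  # e.g. 'audio/mpeg' -> 'audio'
    return top_level if top_level in ("audio", "video", "image") else "unknown"


def watermark_for(filename: str) -> str:
    """Pick the watermark label that would be drawn on the item's tile."""
    return WATERMARKS[media_kind(filename)]
```

A call such as `watermark_for("song.mp3")` would then select the audio watermark, while files with an unrecognized extension fall back to the `unknown` label.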

An issue we couldn’t solve yet is that a video is played in the background, but its picture is not displayed on the screen.

Weekly progress CW27/28 — 7. Juli 2015

Done last week:
– Movements have been smoothed and made more accurate
– Items are accessible by pushing the right hand towards the camera (more intuitive)
– Only one person is tracked at a time now
– Tried to fine-tune some parameters for a better experience
– Added various welcome messages depending on the current time
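
The smoothing of movements mentioned in the list above could look roughly like the following sketch. It assumes an exponential moving average over the tracked hand position, which is only one possible technique and not necessarily the one we use; the class name `HandSmoother` and the parameter `alpha` are hypothetical.

```python
from typing import Optional, Tuple

Point = Tuple[float, float]


class HandSmoother:
    """Exponential moving average over 2D hand positions (assumed technique)."""

    def __init__(self, alpha: float = 0.3) -> None:
        # alpha close to 0 -> strong smoothing, alpha = 1 -> no smoothing at all.
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha
        self._state: Optional[Point] = None

    def update(self, raw: Point) -> Point:
        """Feed one raw position and return the smoothed position."""
        if self._state is None:
            # The first sample is taken over unchanged.
            self._state = raw
        else:
            a = self.alpha
            self._state = (
                a * raw[0] + (1.0 - a) * self._state[0],
                a * raw[1] + (1.0 - a) * self._state[1],
            )
        return self._state
```

A smaller `alpha` makes the shelf follow the hand more calmly but also with more delay, so the value is one of the parameters that would need tuning.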

Postponed because of lack of time:
– Introduction of a cursor and, with it, the button overlay
– Speech recognition
– Zoom into the shelf

Current progress — 9. Juni 2015

For better comparability and overview, the following posts will always have the same structure, namely ‘Storage handling’, ‘Graphical presentation’, ‘Gesture recognition’ and ‘Speech recognition and synthesis’.


Storage handling
[Responsibility: Katharina]

The intended folder structure for the virtual media library (VML) should be as follows:

  • program root folder
    • images
    • music
    • video

In the first step the root folder path will be hard-coded by us. We also thought about letting the user choose a path himself the first time he starts the VML. However, this feature will only be implemented if there is time at the end. Because the VML is meant as a proof of concept, we are starting with content consisting of two files for each media category. Initially the complete content will be displayed. We already have an idea for an extension here: group the content by category and let the user choose whether he wants to see the whole content or only one category.
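
The folder structure and the optional category filter described above can be sketched as follows. This is only an illustration under the assumption that every file in a category folder belongs to that category; the function names `load_library` and `visible_items` are hypothetical.

```python
from pathlib import Path
from typing import Dict, List, Optional

# Sub-folders of the program root folder, as listed above.
CATEGORIES = ("images", "music", "video")


def load_library(root: Path) -> Dict[str, List[Path]]:
    """Collect the files of each category folder below the root folder."""
    library: Dict[str, List[Path]] = {}
    for category in CATEGORIES:
        folder = root / category
        if folder.is_dir():
            library[category] = sorted(p for p in folder.iterdir() if p.is_file())
        else:
            # A missing folder simply yields an empty category.
            library[category] = []
    return library


def visible_items(library: Dict[str, List[Path]],
                  category: Optional[str] = None) -> List[Path]:
    """Return the whole content, or only one category if a filter is given."""
    if category is None:
        return [item for name in CATEGORIES for item in library[name]]
    return list(library[category])
```

With the hard-coded root path, `visible_items(load_library(root))` would give the complete content for the initial display, and `visible_items(library, "music")` would give the filtered view of the planned extension.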


Graphical presentation
[Responsibility: Zhe]

In the first step the VML consists of a matrix of six columns by five rows. The ordering of the items within the rows is still under discussion. Since the VML should also offer a zoom function, the idea is to reorder the items dynamically whenever the zoom factor changes.
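
One possible way to reorder the items when the zoom factor changes is sketched below. It assumes that zooming in by a factor reduces the number of visible columns and rows by the same factor and that items are filled row by row; both are assumptions, since the ordering is still open. The function names are hypothetical.

```python
from typing import List, Tuple

BASE_COLUMNS = 6  # matrix width at zoom factor 1.0
BASE_ROWS = 5     # matrix height at zoom factor 1.0


def columns_for_zoom(zoom: float) -> int:
    """Number of visible columns for a zoom factor (assumed: fewer when zoomed in)."""
    if zoom <= 0:
        raise ValueError("zoom must be positive")
    return max(1, round(BASE_COLUMNS / zoom))


def visible_capacity(zoom: float) -> int:
    """How many items fit on the screen at the given zoom factor."""
    rows = max(1, round(BASE_ROWS / zoom))
    return columns_for_zoom(zoom) * rows


def layout(item_count: int, zoom: float = 1.0) -> List[Tuple[int, int]]:
    """Assign every item a (row, column) cell, filling the rows left to right."""
    columns = columns_for_zoom(zoom)
    return [divmod(index, columns) for index in range(item_count)]
```

At zoom factor 1.0 the seventh item starts the second row; at zoom factor 2.0 only three columns remain, so the fourth item already wraps to the next row.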

The displayed items will have fixed dimensions: a square showing a cover if there is one, otherwise a color that is chosen randomly or depending on the category of the item. Because an item would be hard to identify without a cover, the file name will also be displayed under the cover.
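
The fallback color for items without a cover could be chosen as in the following sketch. The concrete RGB values and the names `CATEGORY_COLORS` and `tile_color` are hypothetical; the sketch only shows the two options mentioned above, a color per category and a random color.

```python
import random
from typing import Dict, Optional, Tuple

Color = Tuple[int, int, int]  # RGB, each channel 0..255

# Hypothetical colors per category; the real palette is not decided yet.
CATEGORY_COLORS: Dict[str, Color] = {
    "images": (70, 130, 180),
    "music": (60, 179, 113),
    "video": (205, 92, 92),
}


def tile_color(category: str, rng: Optional[random.Random] = None) -> Color:
    """Color of a tile without a cover: per category if known, otherwise random."""
    if category in CATEGORY_COLORS:
        return CATEGORY_COLORS[category]
    rng = rng or random.Random()
    return (rng.randrange(256), rng.randrange(256), rng.randrange(256))
```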


Gesture recognition
[Responsibility: Rogeria]

The Kinect camera will be used to navigate through the VML. It tracks the movements of the user’s arms and hands. To interact, the user raises either the left or the right hand and moves it in the direction in which the VML should move. We are still discussing whether the user can move the VML in only one direction at a time or in both directions at once. Before we can decide, we have to figure out how practical it is to allow movement in multiple dimensions. The first test case will be a simple quad that moves together with a hand gesture of the user.
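
The two options under discussion, movement along one axis at a time or along both axes, can be compared with a small sketch like the one below. It does not use the Kinect SDK; it assumes that the tracking already delivers the hand displacement between two frames as a pair of numbers. The function name `shelf_delta` and the `dead_zone` value are hypothetical.

```python
from typing import Tuple

Vector = Tuple[float, float]


def shelf_delta(hand_delta: Vector, lock_axis: bool = True,
                dead_zone: float = 0.02) -> Vector:
    """Turn a hand displacement (dx, dy) into a displacement of the shelf.

    lock_axis=True  -> only the dominant direction is applied (one axis at a time)
    lock_axis=False -> both directions are applied at once
    """
    dx, dy = hand_delta
    # Ignore tiny movements so that a shaky hand does not move the shelf.
    if abs(dx) < dead_zone:
        dx = 0.0
    if abs(dy) < dead_zone:
        dy = 0.0
    if lock_axis:
        if abs(dx) >= abs(dy):
            dy = 0.0
        else:
            dx = 0.0
    return (dx, dy)
```

Calling the function with `lock_axis=False` on the same input shows directly how much diagonal drift the user would get when both directions are allowed.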


Speech recognition and synthesis
[Responsibility: Marcus]

To get an interactive system we want to implement speech recognition and synthesis to ‘communicate’ with the user. The speech recognition can be used either to change the displayed category, if this feature is implemented, or to interact with an item in the VML. For example, the user can say the identification number or the name of an item to access it. The audio output will inform the user about the current status, warnings or errors. Our first choice for a speech recognition framework was the one developed by Google. Unfortunately we have seen that the Google framework only offers 50 requests per day for free. If you want to make more requests, you have to pay for them. Therefore we have to decide whether we should use this framework or not, and what an alternative could be.
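
Independent of the framework that is chosen, accessing an item by its spoken identification number or name could be sketched as follows. The sketch assumes that the recognizer returns the utterance as plain text; the function name `resolve_item` and the example items are hypothetical.

```python
from typing import Dict, Optional


def resolve_item(utterance: str, items: Dict[int, str]) -> Optional[int]:
    """Map recognized text to an item id, by number first and then by name.

    `items` maps identification numbers to item names.
    Returns None if nothing matches.
    """
    text = utterance.strip().lower()
    if text.isdecimal():
        number = int(text)
        return number if number in items else None
    for item_id, name in items.items():
        if name.lower() == text:
            return item_id
    return None


# Example content with two hypothetical items.
example_items = {1: "Holiday.jpg", 2: "Song.mp3"}
selected = resolve_item("song.mp3", example_items)
```

A return value of `None` would be the case in which the audio output reports that no matching item was found.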


At the moment we are busy planning how to work on the individual topics and how to connect them efficiently. We are trying to get familiar with the individual parts and to figure out quickly what the best next steps are.

Optical illusion: ‚Rotating Snakes‘ — 6. Mai 2015

Optical illusion from

The optical illusion above is a so-called ‘Rotating Snakes’ illusion and belongs to the class of ‘Peripheral Drift’ illusions. This kind of illusion is characterized by the fact that the motion signals can only be perceived in the periphery of the focused visual field. In this case the order of the colors used is responsible for these motion signals. The effect would also show up in a gray-scale pattern; what is critical for the motion effect is the luminance and the contrast of the colors used in the pattern.

The whole image consists of concentrically arranged blocks that form circular rings. Each of these rings consists of a repeating sequence of four colored elements in the following order:

Black -> Blue -> White -> Yellow
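
To illustrate why the same sequence also works as a gray-scale pattern, the following sketch computes the luminance of the four colors. It assumes pure RGB primaries (the exact colors of the image are not known here) and uses the Rec. 709 luminance weights.

```python
from typing import List, Tuple

RGB = Tuple[float, float, float]  # channels in the range 0.0 .. 1.0

# Assumed pure colors for the four elements of the repeating sequence.
SEQUENCE: List[Tuple[str, RGB]] = [
    ("black", (0.0, 0.0, 0.0)),
    ("blue", (0.0, 0.0, 1.0)),
    ("white", (1.0, 1.0, 1.0)),
    ("yellow", (1.0, 1.0, 0.0)),
]


def relative_luminance(rgb: RGB) -> float:
    """Luminance of a color using the Rec. 709 weights for R, G and B."""
    r, g, b = rgb
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


for name, rgb in SEQUENCE:
    print(f"{name:6s} luminance = {relative_luminance(rgb):.4f}")
```

Under these assumptions blue is almost as dark as black and yellow almost as bright as white, so the sequence corresponds to black -> dark gray -> white -> light gray in terms of luminance.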

The observed rotational movement in this static image can be explained by the mechanisms of the human eye. First of all, the motion effect is essentially influenced by so-called saccades. These are fast movements of both eyes in the same direction which are used to direct the gaze towards an object or to scan the environment. They occur at a rate of two to three movements per second. A subject can observe their influence by fixating a specific point and afterwards looking around the image a bit. These small movements change the image projected onto the eye’s retina, which in turn stimulates the neurons in the individual layers of the retina in constantly changing ways.

To get a better understanding of what happens during this illusory effect, we have to dive into the structure of the retina. As explained at the beginning, the effect relies on the luminance and the contrast of the blocks in the concentric circles.

Every time the eye looks at a specific point in an image or a landscape, an image of the scene is projected onto the retina. This projected image is processed by the neurons of the retinal layers (rods, cones, bipolar cells, etc.). If the scene changes, e.g. because of a saccade, the information on the retina changes and the affected neurons rapidly signal the new information. After that, the signaling slows down until the next change occurs. This decrease in signaling is called ‘adaptation’ and makes processing more efficient, because the neurons do not keep sending redundant information at a high rate.

The interesting part for this explanation is the difference in how fast the various contrasts ‘adapt’. Higher contrast results in higher neural activity, whereas moderate contrast results in only moderate activity. The relative changes in neural signaling are detected by motion mechanisms, which causes the illusion of movement in the peripheral view.