Interesting: The State of the News Media 2008

April 16th, 2008 Irfan Essa Posted in Computational Journalism, Interesting | No Comments »

The Project for Excellence in Journalism has published an excellent report, “The State of the News Media 2008”, on American journalism. It is a very interesting read and highly relevant to our efforts on Computation and Journalism.


Paper: ICASSP (2008) “Discriminative Feature Selection for Hidden Markov Models using Segmental Boosting”

April 3rd, 2008 Irfan Essa Posted in Face and Gesture, James Rehg, Numerical Machine Learning, PAMI/ICCV/CVPR/ECCV, Papers, Pei Yin, Thad Starner | No Comments »

Pei Yin, Irfan Essa, James Rehg, Thad Starner (2008) “Discriminative Feature Selection for Hidden Markov Models using Segmental Boosting”, ICASSP 2008 - March 30 - April 4, 2008 - Las Vegas, Nevada, U.S.A. (Paper: MLSP-P3.D8, Session: Pattern Recognition and Classification II, Time: Thursday, April 3, 15:30 - 17:30, Topic: Machine Learning for Signal Processing: Learning Theory and Modeling) (PDF|Project Site)

ABSTRACT

We address the feature selection problem for hidden Markov models (HMMs) in sequence classification. Temporal correlation in sequences often causes difficulty in applying feature selection techniques. Inspired by segmental k-means segmentation (SKS), we propose Segmentally Boosted HMMs (SBHMMs), where the state-optimized features are constructed in a segmental and discriminative manner. The contributions are twofold. First, we introduce a novel feature selection algorithm, where the temporal dynamics are decoupled from the static learning procedure by assuming that the sequential data are piecewise independent and identically distributed. Second, we show that the SBHMM consistently improves traditional HMM recognition in various domains. The reduction of error compared to traditional HMMs ranges from 17% to 70% in American Sign Language recognition, human gait identification, lip reading, and speech recognition.


Event: SIGGRAPH PC Meeting at GA Tech

March 30th, 2008 Irfan Essa Posted in Events, Greg Turk, SIGGRAPH/SCA/NPAR/EG | No Comments »

The ACM SIGGRAPH 2008 Papers Committee Meeting was held at GA Tech in Atlanta, March 29-30, under the leadership of Greg Turk. Following is a picture of all of us at work, with our sigs, as a note of thanks for Greg.

20080330-at-17h07m27-mg-9450bw.jpg

Original photo by me; this version, with sigs, is by Fredo Durand.


Funding: NSF (2008) “Symposium on Computation and Journalism”

March 8th, 2008 Irfan Essa Posted in Computational Journalism, Funding | No Comments »

Award#0813831 - Symposium on Computation and Journalism

ABSTRACT:

Fundamentally, journalism is aimed at collecting news information and disseminating that information with a layer of contextualization and understanding provided by journalists. Recent advances in computational technology are rapidly affecting how news information is gathered, reported and distributed. Furthermore, new avenues for aggregating, visualizing, summarizing, consuming, and collaborating on news are increasingly becoming popular and challenging traditional practices of Journalism. Following the success of text search, image and video search questions are now poised to make a bigger impact on journalism and other related fields. Computation and Journalism individually share a deep-rooted interest in Information, and the value it provides to society. The concept of Information Quality, the measure of the value that the information provides to the user of that information, brings these two disciplines together. In computing and information sciences, information quality is used to describe the degree of excellence in communicating knowledge or intelligence and is composed of different facets such as accuracy, reliability, comprehensiveness, currency, and validity. In journalism, where the conveyance of quality information is paramount, principles such as accuracy, fairness, thoroughness, and transparency guide journalists in communicating quality information. Traditionally, journalism has also entailed an ethos of working on the side of the citizenry to provide them with quality information they need to make informed decisions in the process of their daily lives. However, the plethora of un-vetted blogs, podcasts, videos and other online media, generated by users or by corporations with subjective biases, has led to significant compromise in information quality. Collaborative knowledge generation (Wikipedia) and citizen journalism are showing new ways in which information and (global) news can be shared.
However, as the Web and the Internet continue to grow and as computing technologies pervade the planet, a thorough study of the process of journalism and the deep computational aspects of such processes needs to be undertaken. To this end, the PI’s research group at Georgia Institute of Technology is interested in understanding how computational advances impact the field of journalism. The long term aim is to make novel contributions by developing computational technologies to better support the goals of journalism. To launch this effort, they are organizing a Symposium on Computation + Journalism at GA Tech, in Atlanta, GA, February 22-23, 2008. The goal of this symposium is to bring together stakeholders from all aspects of Journalism, Media, and Computation. Participants in panels, presentations and breakout groups will discuss these issues and create a roadmap towards answering the questions that bring together computation and journalism.


Personal: Creative Use of Computational Photography and Journalism

March 3rd, 2008 Irfan Essa Posted in Personal | No Comments »

Irfan’s Office Hacked

My students decided to play a very nice joke on me. This morning I walked in to find my office open (and it was not!)

Office Open Office Really Open

Check out the left image, with the door closed, and the right image, with the door open.

And, then the inside of the office was kinda different too.

Inside of the Office


Event: Journalism 3G The Future of Technology in the Field

February 23rd, 2008 Irfan Essa Posted in Computational Journalism, Events, Nick Diakopoulos | No Comments »

Journalism 3G: The Future of Technology in the Field (A Symposium on Computation and Journalism) was a huge success.

  • We had over 230 registered attendees. Thanks to all participants, panelists, and speakers.
  • Use our Social Network (http://cj.crowdvine.com/) to continue the conversation.
  • Join the FACEBOOK group (http://git.facebook.com/group.php?gid=18427444784)
  • Use the tag “CnJ” on all blog posts and photo/video posts on the web, so we can collect them
  • Videos of the event are now available here.

20080223_0351-0355-pano-200p.jpg


Event: Symposium on computation+journalism (Feb 22-23, 2008, Atlanta, GA)

February 15th, 2008 Irfan Essa Posted in Events, Nick Diakopoulos | No Comments »

Working with Brad Stenger (Wired), Nick Diakopoulos (GA Tech), and Sergio Goldenberg (GA Tech), we are organizing a Symposium on computation+journalism to bring together computationalists, internet/media experts, and journalists for a series of panels, presentations, and discussions around how computing technologies are affecting (and changing) journalism practices. We have over 180 people registered and it promises to be a great first-of-its-kind event. This event is being hosted by the GVU Center at Georgia Tech.


Event: AAAI 2008 Special Track on Physically-Grounded AI

February 14th, 2008 Irfan Essa Posted in Events, Service | No Comments »

I am co-chairing, with Drew Bagnell (CMU) and Wolfram Burgard (University of Freiburg), a Special Track on Physically-Grounded AI. See AAAI-08: Twenty-Third Conference on Artificial Intelligence, Chicago, IL, USA. The goal of this special track is to bring researchers from computer vision, robotics, machine learning and activity recognition to AAAI in a unified forum. All papers in this track will be full AAAI papers.

We received around 60 submissions to this track and expect a few NECTAR (new scientific and technical advances in research) submissions too (DUE Feb 18, 2008). The primary track submissions are in the process of review.

Abstract Submission Deadline: January 25, 2008 *DONE*
Paper Submission Deadline: January 30, 2008 *DONE*
Author Notification Deadline: April 1, 2008


Spring 2008 Term

January 4th, 2008 Irfan Essa Posted in Personal, Research, Teaching | No Comments »

Spring 2008 Term at GA Tech begins Monday 1/7/2008. It will be a busy term with the following activities, in addition to my research related activities.


Personal: Read this Book “Three Cups of Tea” By Greg Mortenson and David Oliver Relin

December 4th, 2007 Irfan Essa Posted in Personal | No Comments »

“Three Cups of Tea” By Greg Mortenson and David Oliver Relin
This is a great book. I have just started reading it, but many have recommended it.

Also see


Paper: MICCAI (2007) “A Boosted Segmentation Method for Surgical Workflow Analysis”

November 1st, 2007 Irfan Essa Posted in Activity Recognition, Health Systems, Papers, Research | No Comments »

N. Padoy, T. Blum, I. Essa, H. Feußner, M.O. Berger, N. Navab. “A Boosted Segmentation Method for Surgical Workflow Analysis”, Proceedings of Medical Image Computing and Computer-Assisted Intervention (MICCAI 2007) (to appear), Brisbane, Australia, Oct. 29 - Nov. 2, 2007 (bib)

Abstract

As demands on hospital efficiency increase, there is a stronger need for automatic analysis, recovery, and modification of surgical workflows. Even though most of the previous work has dealt with higher level and hospital-wide workflow including issues like document management, workflow is also an important issue within the surgery room. Its study has a high potential, e.g., for building context-sensitive operating rooms, evaluating and training surgical staff, optimizing surgeries and generating automatic reports. In this paper we propose an approach to segment the surgical workflow into phases based on temporal synchronization of multidimensional state vectors. Our method is evaluated on the example of laparoscopic cholecystectomy with state vectors representing tool usage during the surgeries. The discriminative power of each instrument in regard to each phase is estimated using AdaBoost. A boosted version of the Dynamic Time Warping (DTW) algorithm is used to create a surgical reference model and to segment a newly observed surgery. Full cross-validation on ten surgeries is performed and the method is compared to standard DTW and to Hidden Markov Models.
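
To make the Dynamic Time Warping step concrete, here is a minimal sketch of DTW over tool-usage state vectors. It is not the paper's boosted DTW: the per-instrument `weights` argument is only a hypothetical stand-in for the discriminative weights the abstract says are estimated with AdaBoost, and the function name and toy data are assumptions made for illustration.

```python
from math import inf


def weighted_dtw(seq_a, seq_b, weights=None):
    """Return the DTW alignment cost between two sequences of equal-length vectors.

    weights: optional per-dimension weights (hypothetical stand-in for learned
    instrument weights); defaults to 1.0 for every dimension.
    """
    dim = len(seq_a[0])
    if weights is None:
        weights = [1.0] * dim

    def frame_cost(x, y):
        # Weighted L1 distance between two state vectors.
        return sum(w * abs(xi - yi) for w, xi, yi in zip(weights, x, y))

    n, m = len(seq_a), len(seq_b)
    # cost[i][j] = best cost of aligning the first i frames of seq_a
    # with the first j frames of seq_b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = frame_cost(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # seq_b frame j is stretched
                cost[i][j - 1],      # seq_a frame i is stretched
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


# Toy binary tool-usage vectors (tool 1 in use, tool 2 in use).
surgery_a = [(1, 0), (1, 0), (0, 1)]
surgery_b = [(1, 0), (0, 1), (0, 1)]
alignment_cost = weighted_dtw(surgery_a, surgery_b)
```

The two toy surgeries use the same tools in the same order but spend different amounts of time in each phase, so the warping absorbs the timing difference and the alignment cost is zero.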


Presentation: CETEE (2007): “Computational Photography & Video: Research & Education”

October 30th, 2007 Irfan Essa Posted in Presentations, Research, Teaching | No Comments »

I was invited to participate and present at the CETEE 2007, Islamabad, November 27-28, 2007.

This meeting has recently been postponed.


Presentation: Advanced Visual Interfaces (2008), “Computational Photography and Video: Interacting and Creating with Videos and Images”

October 30th, 2007 Irfan Essa Posted in Computational Photography and Video, Presentations | No Comments »

I have just been invited to give an Invited Talk at Advanced Visual Interfaces (AVI) 2008, May 28-30, 2008, in Napoli, Italy.

Here is a tentative title/abstract for this future presentation. Thanks to the AVI 2008 organizers for inviting me.

Computational Photography and Video: Interacting and Creating with Videos and Images

Abstract

Digital image capture, processing, and sharing have become pervasive in our society. This has had significant impact on how we create novel scenes, how we share our experiences, and how we interact with images and videos. In this talk, I will present an overview of a series of ongoing efforts in the analysis of images and videos for rendering novel scenes. First I will discuss (in brief) our work on Video Textures, where repeating information is extracted to generate extended sequences of videos. I will then describe some of our extensions to this approach that allow for controlled generation of animations of video sprites. We have developed various learning and optimization techniques that allow for video-based animations of photo-realistic characters. Using these sets of approaches as a foundation, I will then show how new images and videos can be generated. I will show examples of Photorealistic and Non-photorealistic Renderings of Scenes (Videos and Images) and how these methods support the media reuse culture, so common these days with user generated content. Time permitting, I will also share some of our efforts on video annotation and how we have taken some of these new concepts of video analysis to undergraduate classrooms.


Paper: Ergonomics in Design (2007), “Designing a Technology Coach”

October 29th, 2007 Irfan Essa Posted in A. Dan Fisk, Activity Recognition, Aware Home, Papers, Wendy Rogers | No Comments »

FEATURE AT A GLANCE: Technology in the home environment has the potential to support older adults in a variety of ways. We took an interdisciplinary approach (human factors/ergonomics and computer science) to develop a technology “coach” that could support older adults in learning to use a medical device. Our system provided a computer vision system to track the use of a blood glucose meter and provide users with feedback if they made an error. This research could support the development of an in-home personal assistant to coach individuals in a variety of tasks necessary for independent living.

KEYWORDS: home technology, medical devices, support for learning


Paper: IEEE Data Mining Conference 2007 “Detecting Subdimensional Motifs: An Efficient Algorithm for Generalized Multivariate Pattern Discovery”

October 28th, 2007 Irfan Essa Posted in Activity Recognition, Charles Isbell, David Minnen, Papers, Research, Thad Starner | No Comments »

D. Minnen, I. Essa, C.L. Isbell, and T. Starner “Detecting Subdimensional Motifs: An Efficient Algorithm for Generalized Multivariate Pattern Discovery” In IEEE Int. Conf. on Data Mining (ICDM) 2007, Omaha, NE, October 28-31, 2007. [PDF]

Abstract

Discovering recurring patterns in time series data is a fundamental problem for temporal data mining. This paper addresses the problem of locating subdimensional motifs in real-valued, multivariate time series, which requires the simultaneous discovery of sets of recurring patterns along with the corresponding relevant dimensions. While many approaches to motif discovery have been developed, most are restricted to categorical data, univariate time series, or multivariate data in which the temporal patterns span all of the dimensions. In this paper, we present an expected linear-time algorithm that addresses a generalization of multivariate pattern discovery in which each motif may span only a subset of the dimensions. To validate our algorithm, we discuss its theoretical properties and empirically evaluate it using several data sets including synthetic data and motion capture data collected by an on-body inertial sensor.


Awarded the “GVU 15 years of Impact Award”

October 25th, 2007 Irfan Essa Posted in Events, In The News, Research | No Comments »

Jim Foley and Irfan Essa | The Award

Awarded the “GVU 15 years of Impact Award” at the GVU 15 Anniversary Celebration and Symposium on October 25, 2007.


Event: GVU 15 Anniversary Celebration Symposium (2007)

October 25th, 2007 Irfan Essa Posted in Events, Research | No Comments »

GVU 15 Anniversary Celebration and Symposium

GVU will celebrate “15 Years of Impact” at this anniversary symposium on October 25, 2007.

This special day in GVU history will include:

  • Morning keynote addresses by Andy van Dam and Genevieve Bell.
  • Presentation of 15 GVU Impact Awards for the people, projects and programs that have had significant impact on GVU research activities.
  • An afternoon keynote by GVU Director, Elizabeth Mynatt, outlining GVU’s mission and research program.
  • A demo reception where you will witness the latest innovations at the GVU Center.
  • An outdoor bbq in honor of returning GVU alums and faculty.

Paper: ICCV 2007, “Structure from Statistics - Unsupervised Activity Analysis using Suffix Trees”

October 15th, 2007 Irfan Essa Posted in Aaron Bobick, Activity Recognition, Aware Home, PAMI/ICCV/CVPR/ECCV, Papers, Raffay Hamid | No Comments »

Abstract

Models of activity structure for unconstrained environments are generally not available a priori. Recent representational approaches to this end are limited by their computational complexity, and ability to capture activity structure only up to some fixed temporal scale. In this work, we propose Suffix Trees as an activity representation to efficiently extract structure of activities by analyzing their constituent event-subsequences over multiple temporal scales. We empirically compare Suffix Trees with some of the previous approaches in terms of feature cardinality, discriminative prowess, noise sensitivity and activity-class discovery. Finally, exploiting properties of Suffix Trees, we present a novel perspective on anomalous subsequences of activities, and propose an algorithm to detect them in linear-time. We present comparative results over experimental data, collected from a kitchen environment to demonstrate the competence of our proposed framework.
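
As a rough illustration of what "event-subsequences over multiple temporal scales" means, the sketch below counts every contiguous event subsequence up to a chosen length. It deliberately uses naive enumeration rather than a Suffix Tree, so it shows the statistics being gathered and not the efficient data structure the paper proposes; the function name and the kitchen events are hypothetical.

```python
from collections import Counter


def subsequence_counts(events, max_len):
    """Count every contiguous subsequence of length 1..max_len.

    Naive enumeration for illustration only; a suffix tree can expose the
    same counts far more efficiently.
    """
    counts = Counter()
    for length in range(1, max_len + 1):
        for start in range(len(events) - length + 1):
            counts[tuple(events[start:start + length])] += 1
    return counts


# Hypothetical kitchen event stream.
events = ["fridge", "stove", "fridge", "stove", "sink"]
counts = subsequence_counts(events, max_len=3)
repeated_pair = counts[("fridge", "stove")]
```

Short subsequences capture fine-grained event statistics, while longer ones capture coarser structure; a subsequence that is rare relative to the rest of the data would be a candidate for the anomaly analysis the abstract mentions.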


Thesis: Mitch Parry PhD (2007), “Separation and Analysis of Multichannel Signals”

October 9th, 2007 Irfan Essa Posted in Audio Analysis, Mitch Parry, PhD, Thesis | No Comments »

Mitch Parry (2007), Separation and Analysis of Multichannel Signals PhD Thesis [PDF], Georgia Institute of Technology, College of Computing, Atlanta, GA. (Advisor: Irfan Essa)

Abstract

This thesis examines a large and growing class of digital signals that capture the combined effect of multiple underlying factors. In order to better understand these signals, we would like to separate and analyze the underlying factors independently. Although source separation applies to a wide variety of signals, this thesis focuses on separating individual instruments from a musical recording. In particular, we propose novel algorithms for separating instrument recordings given only their mixture. When the number of source signals does not exceed the number of mixture signals, we focus on a subclass of source separation algorithms based on joint diagonalization. Each approach leverages a different form of source structure. We introduce repetitive structure as an alternative that leverages unique repetition patterns in music and compare its performance against the other techniques.

When the number of source signals exceeds the number of mixtures (i.e., the underdetermined problem), we focus on spectrogram factorization techniques for source separation. We extend single-channel techniques to utilize the additional spatial information in multichannel recordings, and use phase information to improve the estimation of the underlying components.


Presentation: U of Maryland: “Computational Photography and Video: Spatio Temporal Analysis for Synthesis”

September 25th, 2007 Irfan Essa Posted in Computational Photography and Video, Presentations | No Comments »

Computational Photography and Video: Spatio Temporal Analysis for Synthesis of Novel Images and Videos.

ABSTRACT

Digital image capture and processing has recently had a significant impact on the computer graphics quest for rendering novel scenes. In this talk, I will present an overview of a series of ongoing efforts in the analysis of images and videos for rendering novel scenes. First I will discuss (in brief) our work on Video Textures, where repeating information is extracted to generate extended sequences of videos. I will then describe some of our extensions to this approach that allow for controlled generation of animations of video sprites. We have developed various learning and optimization techniques that allow for video-based animations of photo-realistic characters. Then I will describe additional approaches for image and video synthesis that build on optimal patch-based copying of samples. I will show how our methods allow for iterative refinement, with a variety of optimization criteria, and allow for extension to synthesis of both images and video from very limited samples. Using these sets of approaches as a foundation, I will then show how new images and videos can be generated. I will show examples of Photorealistic and Non-photorealistic Renderings of Scenes (Videos and Images) and how these methods support the media reuse culture, so common these days with user generated content. Time permitting, I will also share some of our efforts on video annotation and how we have taken some of these new concepts of video analysis to undergraduate classrooms.


Paper: ACM HyperText (2007) “The Evolution of Authorship in a Remix Society”

September 15th, 2007 Irfan Essa Posted in Computational Journalism, Nick Diakopoulos, Papers, Research | No Comments »

N. Diakopoulos, K. Luther, Y. Medynskiy, I. Essa (2007) The Evolution of Authorship in a Remix Society, ACM Hypertext 2007 Conference, Manchester, UK, September 2007

Abstract

Authorship entails the constrained selection or generation of media and the organization and layout of that media in a larger structure. But authorship is more than just selection and organization; it is a complex construct incorporating concepts of originality, authority, intertextuality, and attribution. In this paper we explore these concepts and ask how they are changing in light of modes of collaborative authorship in remix culture. We present a qualitative case study of an online video remixing site, illustrating how the constraints of that environment are impacting authorial constructs. We discuss users’ self-conceptions as authors, and how values related to authorship are reflected to users through the interface and design of the site’s tools. We also present some implications for the design of online communities for collaborative media creation and remixing.

  • N. Diakopoulos, K. Luther, Y. Medynskiy, I. Essa. The Evolution of Authorship in a Remix Society. In Proceedings of Hypertext and Hypermedia. Manchester, UK, September 2007[PDF]
  • N. Diakopoulos, K. Luther, Y. Medynskiy, I. Essa. Remixing Authorship: Reconfiguring the Author in Online Video Remix Culture. Georgia Tech, Technical Report. GIT-IC-07-05. 2007. [PDF]

Funding: NSF/SGER (2007) “Persistent, Adaptive, Collaborative Synthespians”

September 15th, 2007 Irfan Essa Posted in Charles Isbell, Numerical Machine Learning | No Comments »

Award#0749181 - SGER Collaborative Research: Persistent, Adaptive, Collaborative Synthespians
ABSTRACT

This project explores the development of methodologies for populating worlds with persistent, adaptive, collaborative, believable synthetic actors, referred to as Synthespians. These methods are extensions of adaptive models of learning and planning to accommodate the complex, dynamic environments in massive multi-player online games. The intellectual merit includes the development and evaluation of: 1. A behavior development language, with discovery, machine learning, and adaptation of behaviors directly integrated into the language, allowing for the rapid development and deployment of Synthespians. 2. A framework for the actors to recognize and discover plans by observing and modeling the activities of the other agents. An expected outcome of this research is the ability to author complex virtual worlds with many participants that support intelligent and effective interaction between people and machines. Broader Impact: A scientific understanding of how we interact with each other and collaborate will benefit from our ability to simulate complex environments with dynamic and evolving individual and group behaviors. In this project, building and modeling such environments and behaviors is done within a gaming context. This work will in the long run affect and change the fields of education and entertainment. In addition, being able to model large collaborative and interactive scenarios will also help us understand and model large social dynamics phenomena of interest to sociologists and economists.


Paper: AAAI 2007: “Discovering Multivariate Motifs using Subsequence Density Estimation and Greedy Mixture Learning”

August 24th, 2007 Irfan Essa Posted in Activity Recognition, Charles Isbell, David Minnen, Papers, Research, Thad Starner | No Comments »

Discovering Multivariate Motifs using Subsequence Density Estimation and Greedy Mixture Learning

Abstract

The problem of locating motifs in real-valued, multivariate time series data involves the discovery of sets of recurring patterns embedded in the time series. Each set is composed of several non-overlapping subsequences and constitutes a motif because all of the included subsequences are similar. The ability to automatically discover such motifs allows intelligent systems to form endogenously meaningful representations of their environment through unsupervised sensor analysis. In this paper, we formulate a unifying view of motif discovery as a problem of locating regions of high density in the space of all time series subsequences. Our approach is efficient (sub-quadratic in the length of the data), requires fewer user-specified parameters than previous methods, and naturally allows variable length motif occurrences and nonlinear temporal warping. We evaluate the performance of our approach using four data sets from different domains including on-body inertial sensors and speech.
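
The idea of treating motif discovery as a search for high-density regions in the space of subsequences can be sketched very roughly as follows. This is a brute-force, quadratic, fixed-width, univariate simplification, so it is not the sub-quadratic method of the paper and it ignores variable-length occurrences and temporal warping; `width`, `radius`, and the toy series are assumptions chosen for illustration.

```python
from math import dist


def densest_subsequence(series, width, radius):
    """Return (start, neighbour_count) for the window with the most
    non-overlapping windows within `radius` (Euclidean distance).

    Brute-force density estimate; ties keep the earliest window.
    """
    windows = [series[i:i + width] for i in range(len(series) - width + 1)]
    best_start, best_count = None, -1
    for i, window in enumerate(windows):
        count = sum(
            1
            for j, other in enumerate(windows)
            # Skip overlapping windows so a pattern does not match itself.
            if abs(i - j) >= width and dist(window, other) <= radius
        )
        if count > best_count:
            best_start, best_count = i, count
    return best_start, best_count


# Toy series in which the shape [1, 3, 1] occurs three times.
series = [1, 3, 1, 0, 5, 1, 3, 1, 7, 2, 1, 3, 1]
seed_start, seed_neighbours = densest_subsequence(series, width=3, radius=0.5)
```

Here the window starting at index 0 has two close, non-overlapping neighbours (at indices 5 and 10), which makes it the densest point and therefore a natural motif seed.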


Paper: IEEE CVPR (2007) “Tree-based Classifiers for Bilayer Video Segmentation”

June 17th, 2007 Irfan Essa Posted in Antonio Crimisini, Computational Photography and Video, John Winn, Numerical Machine Learning, Papers, Pei Yin, Research | No Comments »

Tree-based Classifiers for Bilayer Video Segmentation (IEEE Xplore)

Yin, Pei; Criminisi, Antonio; Winn, John; Essa, Irfan
School of Interactive Computing, Georgia Institute of Technology, Atlanta, GA 30332, USA
This paper appears in: Computer Vision and Pattern Recognition, 2007. CVPR ‘07. IEEE Conference on
Publication Date: 17-22 June 2007
On page(s): 1 - 8
Number of Pages: 1 - 8
Location: Minneapolis, MN, USA
ISBN: 1-4244-1180-7
Digital Object Identifier: 10.1109/CVPR.2007.383008
Posted online: 2007-07-16 13:18:42.0

Abstract

This paper presents an algorithm for the automatic segmentation of monocular videos into foreground and background layers. Correct segmentations are produced even in the presence of large background motion with nearly stationary foreground. There are three key contributions. The first is the introduction of a novel motion representation, “motons”, inspired by research in object recognition. Second, we propose learning the segmentation likelihood from the spatial context of motion. The learning is efficiently performed by Random Forests. The third contribution is a general taxonomy of tree-based classifiers, which facilitates theoretical and experimental comparisons of several known classification algorithms, as well as spawning new ones. Diverse visual cues such as motion, motion context, colour, contrast and spatial priors are fused together by means of a Conditional Random Field (CRF) model. Segmentation is then achieved by binary min-cut. Our algorithm requires no initialization. Experiments on many video-chat type sequences demonstrate the effectiveness of our algorithm in a variety of scenes. The segmentation results are comparable to those obtained by stereo systems.


Event: CVPR 2009 Conference. Miami FL, USA.

June 15th, 2007 Irfan Essa Posted in Events, Service | No Comments »

We proposed to organize the CVPR 2009 Conference in Miami, FL, USA in June 2009.


Talk: Keynote at WIAMIS 2007 “Data-driven and Procedural Analysis and Synthesis of Multimedia”

June 14th, 2007 Irfan Essa Posted in Computational Photography and Video, Presentations | No Comments »

WIAMIS 2007: “Data-driven and Procedural Analysis and Synthesis of Multimedia”

Abstract

In this talk, I will outline the changes that have come about in the analysis and synthesis of multimedia, due to the availability of large amounts of data. I will present several of the successful methods that have been introduced in the last few years for example-based synthesis for animation and rendering of videos. I will also show how these methods have been extended to other modalities, and how these approaches need to be extended by developing parametric and procedural models to represent temporal variations. Using examples from my group's work and also other efforts, I will discuss how video is becoming an accessible medium for all and I will also discuss some newer work on authoring of multimedia content.


Event: GVU Demo Videos (For Turner) May 15, 2007

May 15th, 2007 Irfan Essa Posted in Events, Research, Teaching | No Comments »

Here are 3 Videos that show me talking about our different efforts to visitors for the Turner Day at the GVU Center.

These videos were recorded by a Video Crew from Turner.  Thanks to them for sharing these.


Showcase: DVFX 2007 Video Productions

April 26th, 2007 Irfan Essa Posted in DVFX | No Comments »

DVFX 2007 Video Productions

The Final Screening for the CS4480 (Digital Video Special Effects) Course, Spring 2007, was held on April 26, 2007 in TSRB (85 5th Street NW, Atlanta, GA 30308) at 12 noon. See the videos at DVFX 2007 Video Productions and all the details about the productions.


Paper: IEEE ICASSP (2007) “Incorporating Phase Information for Source Separation via Spectrogram Factorization”

April 15th, 2007 Irfan Essa Posted in Audio Analysis, Mitch Parry, Papers, Research | No Comments »

Incorporating Phase Information for Source Separation via Spectrogram Factorization

Parry, R.M.; Essa, I.
Coll. of Comput., Georgia Inst. of Technol., Atlanta, GA
This paper appears in: Acoustics, Speech and Signal Processing, 2007. ICASSP 2007. IEEE International Conference on
Publication Date: 15-20 April 2007
Volume: 2
On page(s): II-661 - II-664
Number of Pages: II-661 - II-664
Location: Honolulu, HI
ISSN: 1520-6149
ISBN: 1-4244-0728-1
INSPEC Accession Number:9497202
Digital Object Identifier: 10.1109/ICASSP.2007.366322
Posted online: 2007-06-04 10:15:41.0

Abstract

Spectrogram factorization methods have been proposed for single channel source separation and audio analysis. Typically, the mixture signal is first converted into a time-frequency representation such as the short-time Fourier transform (STFT). The phase information is thrown away and this spectrogram matrix is then factored into the sum of rank-one source spectrograms. This approach incorrectly assumes the mixture spectrogram is the sum of the source spectrograms. In fact, the mixture spectrogram depends on the phase of the source STFTs. We investigate the consequences of this common assumption and introduce an approach that leverages a probabilistic representation of phase to improve the separation results.
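
The claim that the mixture spectrogram is not the sum of the source spectrograms can be checked with a tiny numerical sketch. It uses a single DFT frame instead of a full STFT, and the signal length, frequency bin, and helper name `dft_bin` are assumptions chosen for illustration; it does not implement the probabilistic phase model from the paper.

```python
import cmath
import math


def dft_bin(signal, k):
    """Return the complex DFT coefficient of `signal` at frequency bin k."""
    n = len(signal)
    return sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal))


n = 64  # frame length (assumed)
k = 4   # frequency bin shared by both sources (assumed)

# Two sources at the same frequency but with opposite phase.
source_a = [math.cos(2 * math.pi * k * t / n) for t in range(n)]
source_b = [math.cos(2 * math.pi * k * t / n + math.pi) for t in range(n)]
mixture = [a + b for a, b in zip(source_a, source_b)]

mag_a = abs(dft_bin(source_a, k))      # magnitude of source A at bin k
mag_b = abs(dft_bin(source_b, k))      # magnitude of source B at bin k
mag_mix = abs(dft_bin(mixture, k))     # magnitude of the mixture at bin k
```

Each source has a large magnitude at bin `k`, yet the mixture magnitude there is essentially zero because the two components cancel. Summing the source magnitudes would therefore badly overestimate the mixture, which is why discarding phase can mislead a factorization.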


Paper: ACM IWVSSN (2006) “Unsupervised Analysis of Activity Sequences Using Event Motifs”

October 23rd, 2006 Irfan Essa Posted in AAAI/IJCAI/UAI, Aaron Bobick, Activity Recognition, Aware Home, Papers, Raffay Hamid, Siddhartha Maddi | No Comments »

  • R. Hamid, S. Maddi, A. Bobick, I. Essa. “Unsupervised Analysis of Activity Sequences Using Event Motifs”, In proceedings of 4th ACM International Workshop on Video Surveillance and Sensor Networks (in conjunction with ACM Multimedia 2006).

Abstract

We present an unsupervised framework to discover characterizations of everyday human activities, and demonstrate how such representations can be used to extract points of interest in event-streams. We begin with the usage of Suffix Trees as an efficient activity-representation to analyze the global structural information of activities, using their local event statistics over the entire continuum of their temporal resolution. Exploiting this representation, we discover characterizing event-subsequences and present their usage in an ensemble-based framework for activity classification. Finally, we propose a method to automatically detect subsequences of events that are locally atypical in a structural sense. Results over extensive data-sets, collected from multiple sensor-rich environments are presented, to show the competence and scalability of the proposed framework.


Paper: ACM UIST (2006) “Videotater: an approach for pen-based digital video segmentation and tagging”

October 15th, 2006 Irfan Essa Posted in Computational Photography and Video, Nick Diakopoulos, Papers, Research | No Comments »

Diakopoulos, N. and Essa, I. (2006). Videotater: an approach for pen-based digital video segmentation and tagging. In Proceedings of the 19th Annual ACM Symposium on User interface Software and Technology (Montreux, Switzerland, October 15 - 18, 2006). UIST ‘06. ACM Press, New York, NY, 221-224. [DOI]

Abstract

The continuous growth of media databases necessitates development of novel visualization and interaction techniques to support management of these collections. We present Videotater, an experimental tool for a Tablet PC that supports the efficient and intuitive navigation, selection, segmentation, and tagging of video. Our veridical representation immediately signals to the user where appropriate segment boundaries should be placed and allows for rapid review and refinement of manually or automatically generated segments. Finally, we explore a distribution of modalities in the interface by using multiple timeline representations, pressure sensing, and a tag painting/erasing metaphor with the pen.


Paper: IEEE ISWC (2006) “Discovering Characteristic Actions from On-Body Sensor Data”

October 14th, 2006 Irfan Essa Posted in Activity Recognition, Charles Isbell, David Minnen, Papers, Research, Thad Starner | No Comments »

Discovering Characteristic Actions from On-Body Sensor Data (IEEEXplore)

Minnen, D.; Starner, T.; Essa, I.; Isbell, C.
College of Computing, Georgia Institute of Technology, Atlanta, GA 30332 USA. dminn@cc.gatech.edu
This paper appears in: Wearable Computers, 2006 10th IEEE International Symposium on
Publication Date: Oct. 2006
On page(s): 11 - 18
Number of Pages: 11 - 18
Location: Montreux, Switzerland
ISSN: 1550-4816
ISBN: 1-4244-0598-x
Digital Object Identifier: 10.1109/ISWC.2006.286337
Posted online: 2007-01-22 09:58:15.0

Abstract

We present an approach to activity discovery, the unsupervised identification and modeling of human actions embedded in a larger sensor stream. Activity discovery can be seen as the inverse of the activity recognition problem.