The primary visual cortex (V1) is commonly held as an exemplar for general neural behavior, with the additional benefit of responding to well-characterized stimuli. We compute the abstract simplicial complex of spikes in V1 neurons for different stimuli and study the homology of the resulting surfaces to classify neurons by their stimulus response. The strength of this technique is its independence from any coordinate system; datasets collected from different animals, using different stimuli, or even from different areas of the brain can be readily compared.
In addition to classification, we are using topology to differentiate neurons that are thought to exhibit feedback as well as feedforward information transmission.
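As a toy illustration of the idea (not the actual analysis pipeline, which uses full simplicial homology), the simplest topological invariant, Betti-0, can be read off a co-firing graph: neurons are vertices, an edge joins neurons that spike together often, and the number of connected components is the rank of H0. The rasters and threshold below are hypothetical.

```python
# Sketch only: build a co-firing graph from binary spike rasters and count
# its connected components (Betti-0) with union-find. Real analyses would
# build the full simplicial complex and compute higher homology groups.
import itertools

def cofire_edges(rasters, threshold=0.5):
    """Edge (i, j) if joint spikes exceed `threshold` of the sparser train."""
    edges = []
    for i, j in itertools.combinations(range(len(rasters)), 2):
        joint = sum(a and b for a, b in zip(rasters[i], rasters[j]))
        denom = min(sum(rasters[i]), sum(rasters[j])) or 1
        if joint / denom >= threshold:
            edges.append((i, j))
    return edges

def betti0(n, edges):
    """Number of connected components of the graph on n vertices."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
    return len({find(v) for v in range(n)})

# Toy rasters: neurons 0 and 1 co-fire; neuron 2 fires independently.
rasters = [
    [1, 0, 1, 1, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 0, 1],
]
print(betti0(3, cofire_edges(rasters)))  # → 2 components
```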
Most systems treat each data request as an independent event. However, such
requests in a computer system are driven by programs and user behavior, and
are therefore far from random.
My thesis research focused on techniques for identifying working sets in
dynamic workloads using minimal trace data. We showed that with just block
I/O trace data, which can be collected non-intrusively by analyzing traffic
on the storage bus, we can detect groups of co-accessed data in real time.
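A minimal sketch of the kind of streaming co-access detection described above, not the deployed system: blocks requested within a short window of each other accumulate pair counts, and pairs with high counts are treated as belonging to one working set. The trace, window size, and counts are illustrative.

```python
# Stream a block I/O trace and count which block pairs appear close
# together in time; frequently co-occurring pairs suggest a working set.
from collections import Counter, deque

def coaccess_pairs(trace, window=3):
    """Count (block_a, block_b) pairs accessed within `window` requests."""
    recent = deque(maxlen=window)
    counts = Counter()
    for block in trace:
        for prev in recent:
            if prev != block:
                counts[tuple(sorted((prev, block)))] += 1
        recent.append(block)
    return counts

# Toy trace: blocks 10 and 11 are repeatedly requested together.
trace = [10, 11, 42, 10, 11, 99, 10, 11]
counts = coaccess_pairs(trace)
print(counts[(10, 11)])  # → 5, the strongest co-access signal in the trace
```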
We studied multiple
aspects of predicting data access behavior, and the identification and
exploitation of such behavior to produce predictive caches, improved access
predictors, informed data layout, and automated grouping of related data.
Group
identification can be used to proactively move data to reduce power
consumption, improve deduplication performance, and isolate faults. Systems
based on our technique have been implemented at Pure Storage and IBM.
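One of the applications listed above, an access predictor, can be sketched in its simplest form as a first-order Markov model that learns, online, the most common successor of each block. This is a generic illustration under assumed data, not the published design.

```python
# First-order Markov access predictor: observe a stream of block IDs and
# predict the most frequently seen successor of the current block.
from collections import Counter, defaultdict

class MarkovPredictor:
    def __init__(self):
        self.next_counts = defaultdict(Counter)
        self.prev = None

    def observe(self, block):
        """Record one access; update the successor counts of the last block."""
        if self.prev is not None:
            self.next_counts[self.prev][block] += 1
        self.prev = block

    def predict(self, block):
        """Most frequently observed successor of `block`, or None."""
        counts = self.next_counts.get(block)
        return counts.most_common(1)[0][0] if counts else None

p = MarkovPredictor()
for b in [1, 2, 3, 1, 2, 3, 1, 2]:   # hypothetical access stream
    p.observe(b)
print(p.predict(2))  # → 3, since 3 always followed 2 in the stream
```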
Storing data for the long term is a complex balance of economics, curation, and monitoring. This project has two components. First, most organizations consider static traces easy to collect because they are relatively easy to anonymize and impose little performance overhead during collection. We are studying how to characterize different classes of workloads using static trace data, both to optimize migration timing and to discover enough workload characteristics to choose what type of storage to build or how to group data.
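Reading a "static trace" as a per-file metadata snapshot (an assumption; the snapshot format and the 90-day "cold" cutoff below are illustrative), a first pass at workload characterization might reduce the snapshot to a few summary features that could inform tiering or migration decisions:

```python
# Derive simple workload features from a static metadata snapshot given as
# (size_bytes, age_days) pairs. Purely illustrative; not the project's model.
def snapshot_features(snapshot, cold_days=90):
    """Median file size and fraction of files not modified in `cold_days`."""
    sizes = sorted(size for size, _ in snapshot)
    median_size = sizes[len(sizes) // 2]
    cold_frac = sum(age > cold_days for _, age in snapshot) / len(snapshot)
    return {"median_size": median_size, "cold_fraction": cold_frac}

# Toy snapshot: mostly small, mostly cold files.
snapshot = [(4096, 400), (1 << 20, 10), (8192, 200), (512, 5), (2048, 365)]
print(snapshot_features(snapshot))  # → {'median_size': 4096, 'cold_fraction': 0.6}
```

A high cold fraction, for instance, would argue for migrating data toward cheaper, denser media sooner.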
The other component of this project is ensuring that we can store data over long periods of time without running out of money to migrate or otherwise maintain it. We explore economic models to calculate the optimal endowment given assumptions about reliability requirements and the rate at which storage media cost per byte decreases. Both are active projects with colleagues at UC Santa Cruz and Seagate.
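A toy version of the endowment question (assumptions only, not the project's actual model): if storing the archive costs some amount per year today and cost per byte falls at a fixed annual rate, the required endowment is the sum of a geometrically declining cost series. Interest, migration costs, and reliability terms are deliberately ignored here.

```python
# Endowment needed to fund storage whose annual cost declines at a fixed
# rate (a simplified Kryder-rate assumption), summed over a long horizon.
def endowment(annual_cost, decline_rate, horizon_years=100):
    """Sum of annual costs that shrink by `decline_rate` each year."""
    return sum(annual_cost * (1 - decline_rate) ** t
               for t in range(horizon_years))

# $10,000/year today, costs falling 20%/year: the series converges toward
# annual_cost / decline_rate = $50,000.
print(round(endowment(10_000, 0.20)))  # → 50000
```

The closed form `annual_cost / decline_rate` also makes the sensitivity obvious: halving the assumed decline rate doubles the required endowment.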
A 10 PB system with an annual loss rate of 0.001% can still expect to lose a terabyte of data every decade. Data loss is stochastic and tends to occur at the whole-device level. Yet as devices grow larger, most individual files remain fairly small, so each failure event impacts more files. By isolating faults within working sets, adding large-stripe parity, and designing erasure codes for flash devices, we are working toward understanding and improving storage reliability and availability in multi-user, multi-application systems, reducing the effect of failures on working sets of files or blocks and thus improving net productivity.
I am currently researching the intersection of storage modeling and network modeling in the brain, particularly redundancy and error correction for fault tolerance. I am also involved in a project with the Nimmerjahn lab at the Waitt Advanced Biophotonics Center at Salk, applying computational neuroscience methods to the analysis of motor neuron and glial interactions in freely moving mice.