Data from neuroscience is fiendishly complex. Neurons exhibit correlations on very long timescales and across large populations, and the activity of individual neurons is difficult to extract from noisy experimental data. I will present several projects, both abstract and applied, that address these issues. First, I will discuss the Probabilistic Deterministic Infinite Automata (PDIA), a nonparametric model of discrete sequences such as natural language or neural spiking. The PDIA explicitly enumerates latent states that are predictive of the future and, by using a hierarchical Dirichlet process prior, can learn arbitrary transitions between those states. The model class learned by the PDIA is smaller than the class of hidden Markov models, but it yields superior predictive performance on data with strong history dependence, such as text. One weakness of the PDIA is that it is hard to scale when the space of possible observations is very large, as is the case with large populations of neurons. In this regime we are instead interested in reducing the dimensionality of the data, and I will present work on unifying the generalized linear model (GLM) framework in neuroscience with dimensionality reduction. The resulting models can be learned efficiently using convex techniques from the matrix completion literature, and can be combined with spectral methods to learn surprisingly accurate models of the dynamics of real neural data. To apply these models to the kinds of high-dimensional neural data now becoming available, we have to bridge the gap between raw data and units of neural activity. I will present joint work with Misha Ahrens and Jeremy Freeman on extracting neural activity from whole-brain recordings in larval zebrafish, as a step towards the long-term goal of making dynamics modeling a daily part of the data analysis routine in neuroscience.