Seminar Series 2005 – 1st Semester

Seminars of the Departamento de Métodos Estatísticos - Instituto de Matemática - UFRJ

The talks are held in room C116 of the Centro Tecnológico at 15:30, unless otherwise announced.


In many types of field experiment it is likely that the treatment applied to one plot will affect neighbouring plots as well as the plot to which it is applied. Such an effect is variously called a neighbour effect, or interference between treatments, or competition between treatments. Where such effects exist, we need not only to model them in the analysis but also to design the experiment with the neighbour effects in mind. This talk will give examples from various different types of treatment, discuss some of the issues, and give some recommendations for design, some derived theoretically, some more ad hoc.

A summary of the main methods for estimating the parameters of Item Response Theory models will be presented. Next, the problem of differential item functioning will be introduced, and the main detection methods will be briefly described. Finally, hierarchical models are proposed for detecting and/or confirming the differential functioning exhibited by a set of items associated with the same theme. Possible MCMC-based approaches to estimating these models are indicated.


In this work we analyze temporal, spatial, and spatio-temporal aspects of the homicides that occurred in the municipalities of the Southeast Region of Brazil from 1979 to 1998. Several types of techniques were used, such as time series models, spatial analysis, and exploratory spatio-temporal analysis using animated maps and surfaces. Bayesian GAMM models with spatio-temporal extensions were also used. The use of these multiple techniques made it possible to observe the increasing trend in homicides in the Southeast Region and their spread over the period studied. The models also revealed more than one type of spatial and temporal process taking place in the states of the Southeast Region.


Lately, there has been increasing interest in finding more flexible and realistic models and evidence measures that represent the features of the data as adequately as possible. In this talk we review recent developments in Bayesian inference based on skew-normal distributions, entropy for modelling dependent variables, and evidence tests for precise hypotheses. The skew-normal distribution, which includes the univariate normal distribution as a special case, can be exploited in Bayesian analysis both for modelling prior beliefs and for modelling asymmetric observations. The book recently edited by Genton (2004) presents a collection of applications in different areas. In this talk, a new extended skew measurement error model is introduced, with an application to real data via WinBUGS. Some basic concepts of Maxent theory are presented in order to characterize the association among variables observed in the same individual (called “frailty entropy”), and an application to real recurrent data is also considered. Finally, the talk closes by introducing a “fully” Bayesian method for precise hypotheses (Pereira and Stern, 1999) to compute the evidence for the Poisson distribution against the Zero-Inflated Poisson (ZIP) distribution. This measure is intuitive and easy to implement via WinBUGS as an alternative to classical p-value tests.
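
As an illustration of the claim that the skew-normal family contains the normal distribution as a special case, the following is a minimal sketch, assuming the standard Azzalini form of the density, 2 * phi(x) * Phi(shape * x). The function names and the crude midpoint integrator are hypothetical helpers introduced here for illustration; they are not taken from the talk or from WinBUGS.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def skew_normal_pdf(x, shape=0.0):
    """Standard skew-normal density: 2 * phi(x) * Phi(shape * x).

    With shape = 0 the factor Phi(0) equals 1/2, so the density
    reduces to the standard normal density phi(x).
    """
    return 2.0 * _STD_NORMAL.pdf(x) * _STD_NORMAL.cdf(shape * x)


def integrate(f, lo=-10.0, hi=10.0, steps=20000):
    """Midpoint rule on [lo, hi]; crude, but adequate for a sanity check."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))


# Sanity checks: total mass, and the sign of the mean for a positive shape.
total_mass = integrate(lambda x: skew_normal_pdf(x, shape=3.0))
mean_shape3 = integrate(lambda x: x * skew_normal_pdf(x, shape=3.0))
```

A positive shape parameter pushes mass to the right, so mean_shape3 is positive, while total_mass stays close to 1 for any shape.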


Genomics research is generating vast molecular sequence data ranging from single genes to whole genomes across an increasing number of species. However, a fundamental difficulty in evolutionary studies emerges as the availability of sequences expands. Phylogenetic methods to reconstruct the evolutionary tree relating the sequences traditionally condition on a single, sometimes poorly estimated sequence alignment, where an alignment specifies which residues in the sequences derive from a common origin. This conditioning can cause bias and inappropriate inference in genomic studies, particularly when the sequences are highly diverse. For example, the early branching order of Bacteria, Archaea and Eukaryotes, the three major domains of life, is troublesome to determine. As a solution, I describe a novel Bayesian model for simultaneously estimating alignments and the phylogenetic trees that relate the sequences. This sidesteps the bias issue inherent in sequential estimation. Joint estimation also allows one to model rate variation between sites when estimating the alignment and to use the evidence in shared insertions/deletions (indels) in the sequences to group sister species in the tree. I base this indel process on a Hidden Markov Model that makes use of affine gap penalties and considers indels of multiple residues. I develop a Markov chain Monte Carlo (MCMC) method to sample from the posterior of the joint model, estimating the most probable alignment and tree and their support simultaneously. I describe a new MCMC transition kernel based on the Forward-Backward algorithm and a careful choice of parameter marginalization that improves the algorithm’s mixing efficiency, allowing the MCMC chains to converge even when started from arbitrary alignments. Finally, my software implementation can estimate alignment uncertainty, and I describe a method for summarizing this uncertainty in a single plot.


In this work we use Bayesian methods to solve a parameter estimation (inverse) problem in groundwater contamination. The problem consisted of estimating the values of the water velocity and dispersion parameters of an analytical solution to an advection-dispersion differential equation. Following standard practice in this field, this equation was assumed to provide a good mathematical description of the groundwater system relevant to our work. The experimental data were obtained from a two-well recirculation tracer test performed using an extraction-injection well couplet. Previous work in this field has largely been confined to the use of maximum-likelihood techniques, which depend upon large-sample approximations to provide measures of uncertainty for parameter estimates. We pose the inverse problem in a Bayesian framework: we use diffuse prior distributions and the observed tracer concentrations versus time to obtain posterior distributions for the parameters of interest, employing Markov chain Monte Carlo simulation techniques to explore these posterior distributions. Our principal finding is that, even with appropriate uncertainty bands on all of the relevant parameters, the standard advection-dispersion differential equation used in this field to model groundwater systems like the one we studied is inadequate to fit the observed data. This work demonstrates that Bayesian methods can produce highly informative results in the field of groundwater contamination modeling.


(Abstract taken from Technical Report IME-USP MAE 05/15, Pereira, Stern and Wechsler (2005), “Can a significance test be genuinely Bayesian?”)
This paper reviews the Full Bayesian Significance Test (FBST). The original and the invariant versions of the FBST evidence values are discussed in detail and compared with alternative procedures: Neyman-Pearson-Wald based p-values and Bayes factor based posterior probabilities. Three standard statistical problems, independence in contingency tables, Behrens-Fisher, and variable selection in linear regression, are used to illustrate the advantages and versatility of the FBST.

The existence of phase transitions is well defined, in the standard framework of statistical mechanics, for systems whose interactions are SUMMABLE. For such systems, the occurrence of phase transitions depends strongly on the temperature. When the interactions are non-summable, this dependence is trivial, and we will show that the system can always be truncated. That is, consider a ferromagnetic model in which all interactions between particles more than N units apart are neglected. For any temperature T, the following holds: as N goes to infinity, the Gibbs measures of the system in equilibrium (at temperature T) converge weakly to the Dirac measures associated with the ground states of the system. In other words, the truncated model always has a phase transition when N is large enough. This result, whose proof uses percolation methods, shows that non-summable long-range systems have trivial thermodynamic behaviour.


This talk will provide a treatment of overdispersion in generalized linear models, present a survey of the commonly used models, discuss general ideas of estimation and inference and consider the practical application of the methods in applied statistics. Applications will be drawn from biometry, biostatistics and social science.

It is well known that financial volatilities frequently move together through time, so it is important to find appropriate models to understand and predict the temporal dependence in multivariate high-frequency data. Although widely used, the correlation matrix is not a good measure of dependence when normality assumptions do not hold, and it can be misleading for financial investors.
The copula is one of the most promising tools for describing the dependence between random variables. The main idea behind a copula is to separate the marginal behaviour and the dependence structure of a joint distribution. Previous work using copulas in dynamic models does not account for parameter uncertainty simultaneously. The methods proposed so far use two-step ML approaches, which lead to consistent but not efficient estimators.
We propose a fully Bayesian analysis that simultaneously assesses posterior uncertainty for the parameters of the marginals and the parameters of the dependence structure defined by a copula function.
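
The separation of marginal behaviour from dependence structure can be sketched as follows. This is a minimal illustration only, assuming a bivariate Gaussian copula and exponential marginals; the copula family, the marginals, the parameter values, and all function names are assumptions chosen for the example and are not the model proposed in the talk.

```python
import math
import random
from statistics import NormalDist


def gaussian_copula_sample(n, rho, seed=0):
    """Draw n pairs (u, v) from a bivariate Gaussian copula with correlation rho."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    pairs = []
    for _ in range(n):
        # Correlated standard normals built from two independent draws.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        # The normal CDF maps each coordinate to a Uniform(0, 1) marginal,
        # leaving only the dependence structure.
        pairs.append((std_normal.cdf(z1), std_normal.cdf(z2)))
    return pairs


def exponential_quantile(u, rate):
    """Inverse CDF of an Exponential(rate) distribution."""
    u = min(u, 1.0 - 1e-12)  # guard against log(0) at the upper boundary
    return -math.log(1.0 - u) / rate


# Step 1: dependence only (uniform marginals).
uniforms = gaussian_copula_sample(5000, rho=0.7)
# Step 2: marginals chosen separately, here Exponential(1) and Exponential(2).
draws = [(exponential_quantile(u, 1.0), exponential_quantile(v, 2.0))
         for u, v in uniforms]
```

Changing the rate arguments alters the marginals without touching the dependence, and changing rho alters the dependence without touching the marginals, which is the separation the abstract refers to.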