SAXS How-To

Biological Synchrotron SAXS/WAXS (Small Angle X-ray Scattering / Wide Angle X-ray Scattering)

There are many other websites which discuss SAXS. A few excellent places to start are the European SAXS website, Manfred Kriechbaum’s website and even Wikipedia. The practical parts of this guide are based on the Australian Synchrotron SAXS beamline.

NEWS!: Senior beamline scientist Nigel Kirby has created a comprehensive guide to the Australian Synchrotron SAXS beamline.  Download it here.

Please see here if you require SAXS data analysis or experimental design consultation. 

1. Solution SAXS

1.1 Introduction: SAXS for biological shape determination
Small angle scattering (SAS) is emerging as a powerful tool for the study of biomolecules and their complexes in solution (Wall et al., 2000). The ability to study molecules in solution not only obviates the necessity for crystallisation, which can be an arduous task, but also allows the researcher to examine the molecule under conditions that allow greater structural flexibility and multiple conformational forms (Wall et al., 2000, Millett et al., 2002). This is not to say that SAS should replace crystallographic techniques, which still provide the best atomic structure information for a protein. Rather, the power of SAS lies in the facile determination of the shape and dimensions of native, unfolded and complexed proteins and in the ability to follow changes in these parameters as they occur (Millett et al., 2002, Wall et al., 2000, Trewhella, 1997).

SAS can be used to investigate the shape of both ligand-bound and uncomplexed proteins in solution. The technique was recently used to demonstrate that a family of receptors for bacterial chemotaxis share a two-domain structure with a hinge region that allows the opening and closing of a ligand-binding cleft (Shilton et al., 1996). Another investigation used a comparison of SAS data with the crystal structures of the aspartate transcarbamoylase tense and relaxed state oligomers to demonstrate the ability of the relaxed state to undergo much larger subunit movements than could be observed under the constraints of crystals (Ke et al., 1988). SAS data also helped to shed light on the dimerisation of RAG-1 (a protein involved in recombination events that control the diversity of T-cell receptor and immunoglobulin chains) by providing information on the shape of the dimerisation domain and the orientations of the monomeric subunits within the dimer (Rodgers et al., 1996).

SAS can also provide information regarding the folding pathways and key structural residues and regions of proteins. Recent examples include the use of SAS as a simple test of structural integrity for a group of creatine kinase mutants (Forstner et al., 1997) and as a measure of stability for a group of streptococcal protein G mutants undergoing guanidinium chloride-mediated denaturation (Smith et al., 1996). Chen and colleagues (Chen et al., 1996) have also used SAS data to identify a partially folded kinetic intermediate, consisting of a molten globule domain and a disordered region, that forms during the folding of lysozyme.

1.2 The Basics
When a plane wave, such as that from an X-ray beam, interacts with a sample, secondary coherent waves (defined as those that have additive amplitudes and, hence, can produce interference) are scattered. If elastically scattered, these waves interfere in a manner that can be related to the spatial distribution of the atoms in the sample (Creighton, 1999). Larger particles produce scattering at smaller angles, making SAS a useful tool for investigating the large scale properties of a biomolecule as opposed to its atomic structure.

Data are collected as intensities (I) recorded over a range of angles (ϴ). For mathematical convenience, the data are converted to an intensity function, I(q), related to the scattering vector amplitude (q) such that

q = 4π(sin ϴ)/λ,

where λ is the wavelength of the X-ray radiation and ϴ is half the scattering angle (the angle between the incident and scattered beams is 2ϴ). For monodisperse systems, where particles are randomly distributed with uncorrelated positions and orientations, I(q) is a continuous isotropic function resulting from the sum of the scattering amplitudes of atoms within a finite volume element divided by its volume. I(q) is, therefore, proportional to the scattering from a single particle averaged over all orientations. I(q) can be related to the molecular mass and radius of gyration, Rg, of the molecule by the Guinier equation (Guinier, 1939), which is graphically represented by the Guinier plot (ln I(q) vs q^2). For monodisperse systems, the plot is linear at low q and the molecular mass and Rg are provided by the I(0) intercept and the slope of the line, respectively (Svergun and Koch, 2003). A linear plot of log I(q) vs log q can be used to determine the form of a polypeptide sample, where a slope of approximately -2 indicates a Gaussian chain; -1.66 a chain with excluded volume; and -1 a rod shape.
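
As a minimal sketch of the Guinier analysis described above, the Python below converts an angle to q and fits ln I(q) against q^2 by ordinary least squares. It assumes the standard form of the Guinier approximation, ln I(q) = ln I(0) - (Rg^2/3) q^2. The function names (q_from_theta, guinier_fit) and the synthetic data are hypothetical and are not part of any SAXS package.

```python
import math

def q_from_theta(theta_rad, wavelength):
    """q = 4*pi*sin(theta)/lambda, with theta taken as half the scattering angle."""
    return 4.0 * math.pi * math.sin(theta_rad) / wavelength

def guinier_fit(q, intensity):
    """Least-squares line through (q^2, ln I); returns (Rg, I0).

    Assumes ln I(q) = ln I(0) - (Rg^2 / 3) * q^2, so the caller should
    pass only the low-q part of the curve.
    """
    x = [qi * qi for qi in q]
    y = [math.log(ii) for ii in intensity]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    if slope >= 0:
        raise ValueError("Guinier slope is not negative; check the q range")
    return math.sqrt(-3.0 * slope), math.exp(intercept)

# Synthetic, noise-free Guinier curve (illustrative values only).
true_rg, true_i0 = 20.0, 5.0                       # Rg in Å, I(0) arbitrary
q_values = [0.005 + 0.002 * k for k in range(25)]  # Å^-1, so q*Rg stays below 1.3
i_values = [true_i0 * math.exp(-(true_rg * qi) ** 2 / 3.0) for qi in q_values]

rg_fit, i0_fit = guinier_fit(q_values, i_values)
print(f"Rg = {rg_fit:.2f}, I(0) = {i0_fit:.2f}")
```

Only the low-q region should be passed to the fit; a commonly used rule of thumb for globular particles is to keep q·Rg below about 1.3.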

By means of Fourier transform, I(q) can be converted into the real space pair distance function, P(r). P(r) represents the frequency of vector lengths connecting small volume elements within the scattering particle, weighted by their scattering densities. The largest of these vectors, corresponding to the maximum dimension of the particle (Dmax), will have the smallest frequency. The maximum dimension can, therefore, be identified by the value at which the function approaches zero (Creighton, 1999). For simple shapes, P(r) can provide a straightforward and intuitive representation of the data for visual inspection.
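
To make the idea of P(r) as a frequency of vector lengths concrete, the sketch below builds a histogram of all pairwise distances between points scattered uniformly inside a sphere and reads off the largest distance as Dmax. This is an illustration of the real-space definition only, under the assumption of uniform scattering density; it is not the indirect Fourier transform used on experimental data. The function name, sphere radius and point count are hypothetical.

```python
import math
import random

def pair_distance_histogram(points, bin_width):
    """Histogram of all pairwise distances; returns (bin_centres, counts, d_max)."""
    distances = []
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            distances.append(math.dist(points[i], points[j]))
    d_max = max(distances)
    n_bins = int(d_max / bin_width) + 1
    counts = [0] * n_bins
    for d in distances:
        counts[int(d / bin_width)] += 1
    centres = [(k + 0.5) * bin_width for k in range(n_bins)]
    return centres, counts, d_max

# Fill a sphere of radius 25 (arbitrary units) by rejection sampling from a cube.
rng = random.Random(0)
radius = 25.0
origin = (0.0, 0.0, 0.0)
points = []
while len(points) < 400:
    p = tuple(rng.uniform(-radius, radius) for _ in range(3))
    if math.dist(p, origin) <= radius:
        points.append(p)

centres, counts, d_max = pair_distance_histogram(points, bin_width=2.0)
print(f"Dmax estimate: {d_max:.1f} (sphere diameter is {2 * radius:.1f})")
```

The counts fall towards zero as r approaches the sphere diameter, which is the behaviour used to identify Dmax.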

More complex shapes, often encountered in proteins, cannot easily be interpreted from visual inspection of P(r). Instead, computer-based methods must be employed to identify models that can satisfy the scattering data. Early methods used an angular envelope function, parameterised using spherical harmonics, to represent the particle shape (Svergun and Stuhrmann, 1991, Svergun et al., 1996), but today dummy atom models (DAMs), such as DAMMIN, are more commonly used (Chacon et al., 1998, Svergun, 1999, Svergun et al., 2001). DAMs start with a defined volume that is filled with M densely packed spheres of a smaller radius. Each sphere may belong to the particle (index = 1) or to the solvent (index = 0) and the shape is described by a binary string X of length M. The model building process begins with a random assignment of 1s and 0s, which are randomly relocated until a solution of best fit (determined by error values) is found. Constraints are usually placed on the distribution of the dummy atoms, either to ensure compactness (Svergun, 1999) or to ensure spatial compatibility with a peptide chain, as in the case of GASBOR (Svergun et al., 2001). In some cases, several solutions will provide a good fit for a single SAS data set. For this reason, the software packages SUPCOMB, DAMAVER and DAMFILT respectively allow the alignment of separate models and the creation of an averaged, filtered, ‘most-probable’ model (Kozin and Svergun, 2001, Volkov and Svergun, 2003).
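
The binary-string representation can be sketched in a few lines of Python. The toy example below places M = 64 dummy atoms on a grid, describes a shape by a list X of 1s and 0s, computes a scattering curve with the Debye formula for identical point scatterers, and flips random entries of X, keeping a flip only if the fit to a target curve improves. This greedy rule is a deliberate simplification chosen for brevity: it is not the optimisation used by DAMMIN or GASBOR, and it applies no compactness or connectivity constraints. All names, the grid size and the target shape are hypothetical.

```python
import math
import random

def debye_intensity(coords, q_values):
    """Debye formula for identical point scatterers: I(q) = sum_ij sin(q*r_ij)/(q*r_ij)."""
    n = len(coords)
    pair_r = [math.dist(coords[i], coords[j])
              for i in range(n) for j in range(i + 1, n)]
    curve = []
    for q in q_values:
        total = float(n)  # the i == j terms each contribute 1
        for r in pair_r:
            total += 2.0 * math.sin(q * r) / (q * r)
        curve.append(total)
    return curve

def discrepancy(model_i, target_i):
    """Sum of squared differences after scaling both curves to their first point."""
    return sum((m / model_i[0] - t / target_i[0]) ** 2
               for m, t in zip(model_i, target_i))

def occupied(grid, x):
    """Coordinates of the dummy atoms assigned to the particle (index = 1)."""
    return [p for p, bit in zip(grid, x) if bit == 1]

spacing = 5.0
grid = [(i * spacing, j * spacing, k * spacing)
        for i in range(4) for j in range(4) for k in range(4)]  # M = 64
q_values = [0.02 * (k + 1) for k in range(8)]

# Target curve: a short rod of four dummy atoms along one grid edge.
target_x = [1 if (p[0] == 0.0 and p[1] == 0.0) else 0 for p in grid]
target_i = debye_intensity(occupied(grid, target_x), q_values)

rng = random.Random(1)
x = [rng.randint(0, 1) for _ in grid]  # random starting assignment
start_score = discrepancy(debye_intensity(occupied(grid, x), q_values), target_i)
score = start_score
for _ in range(200):
    k = rng.randrange(len(x))
    x[k] ^= 1                          # move one dummy atom between particle and solvent
    if sum(x) == 0:                    # an empty particle has no scattering curve
        x[k] ^= 1
        continue
    trial = discrepancy(debye_intensity(occupied(grid, x), q_values), target_i)
    if trial < score:
        score = trial                  # keep the improvement
    else:
        x[k] ^= 1                      # undo the flip

print(f"discrepancy: {start_score:.4g} -> {score:.4g}, atoms in particle: {sum(x)}")
```

Running the loop with different random seeds generally gives different final strings X, which is a small-scale picture of why several independent models are built and then averaged.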

1.3 Preparing protein samples for SAXS
Here are my suggestions when preparing a sample for SAXS analysis.
  • Bring some pure filtered water for normalisation.
  • Filter your buffer through at least a 0.22 μm filter.
  • Make sure you always use that exact buffer as a blank, not another lot of the buffer made up to the same recipe. Slight differences will lessen your confidence in the background subtraction.
  • For lab source SAXS, you should aim to have a 10 mg/ml solution and you can do dilutions of that. For synchrotron SAXS you won’t need such a high concentration, but it’s probably worth starting at 5 mg/ml and making some dilutions.
  • Do light scattering first! Light scattering will tell you if your solutions likely have aggregates and may even tell you if you have more than one conformation within a monomeric sample.
  • If you find aggregation is a problem, you can try a molecular weight cut-off filter (like these ones), but do remember the cut-off is nominal and will depend upon the shape of your protein. It’s best to test the filtration using light scattering.
  • If you still have aggregation problems, you may want to reduce the salt concentration of your buffer or change the pH or you may be able to avoid aggregation by keeping the solution cool.

1.4 Analysing the data from protein solution SAXS
  • The SAXS beamline at the Australian Synchrotron (as well as ChemMatCARS in Chicago) uses SAXS15id. You can download this software at http://www.synchrotron.org.au/index.php/aussyncbeamlines/saxswaxs
  • Start by opening SAXS15id, then opening your parameters file (.sax) and your log file (.log).
  • First open an air shot file. Go to Data → normalisation and click to calibrate to air shot.
  • Next, open each of your empty capillary data files into the blank channels. Do this by making sure the radio button for current read in is on blank, then go to file → process SAXS image → Single image and choose your first empty capillary data file. Next, click the radio button ‘2’ on the current read in row and select the next file (you can do this using the sliding bar). Continue with 4, 6 etc.
  • Open your water data files into the sample channels (1, 3, 5 etc.).
  • Turn subtraction on by checking the box for blank, 2 etc. on the blank subtractor row.
  • Zoom in to the flat part at the end of the curves. Do this by pressing the in button next to zoom plot.
  • Go to Data → normalisation and either choose Normalise I0 & transmission or, for the Australian Synchrotron SAXS beamline, choose Beamstop.
  • Then, under use, click calibrate. This takes you to the graphs: mark any dodgy channels as unused and make sure the fits are good.
  • Click use to both options at the top of the window
  • Test calibration gradient and blank gradient, click to all profiles
  • You have now normalised your data. Save your new saxs parameters file with a new name (Go to Data → Save SAXS parameters).
  • Now that you have normalised your data, you can check for aggregation using your I(0) value. You will need the extinction coefficient of your protein and the A280 of the sample you used to get the data (measured after filtration if you filtered). The method is described by Orthaber et al. (2000). In summary, you use the following formula:
            mw = I(0) × (Avogadro’s number) / {(Δσ × v)^2 × c}
                where Δσ = difference in scattering length density between sample and buffer; v = specific volume of the sample (cm^3/g); and c = concentration (g/cm^3)
  • Next open your buffer data files into the blank channels and your protein data files into the sample channels and subtract backgrounds.
  • To save, go to File → save profiles. I usually do one at a time, so that’s N profiles to N files.
  • Once you have exported your files, you can open them in Igor or in PRIMUS.
  • You can get a free demo version of Igor if you only intend to use it for a short period of time, but you will need to buy a license if you intend to use it for many months. Go to http://www.wavemetrics.com/
  • You will need to get the IRENA and the NIST add-ons. First go to File → Open experiment and choose the NIST experiment file you need (e.g. SANS_Reduction_v5.1.pxt), then go to Macros → Load IRENA macros.
  • Igor is now ready. I suggest you save experiment now as ready.pxp or something similar so you don’t have to waste time loading the add-ons each time. Go to SAS → Data import export (this will open a window), select your data path, choose your file, click Qvec as column 1, Int as column 2 and Err as column 3. Click test. Click use QRS wave names, then click import. Now you can make whatever plots interest you. E.g. a linear plot of log I(q) vs log q can be used to determine the form of a polypeptide sample, where a slope of approximately -2 indicates a Gaussian chain; -1.66 a chain with excluded volume; and -1 a rod shape. A Guinier plot (ln I(q) vs q^2) can be used to provide the radius of gyration (based on the slope at low q). The error should be less than 1.3.
  • For PRIMUS, open tools → Data processing, press select, choose your file from the browser, then press plot. By using the up/down arrows next to beginning and end you can truncate your data. There are all kinds of plots available; just click through the buttons at the bottom and truncate your data according to the needs of the plot (e.g. for Guinier use only the low q end). PRIMUS can also be used to manipulate your data (e.g. buffer subtraction and averaging). Just use the bottom row to give your output file a name.
  • Once you have decided on any data that needs to be truncated and have determined the Rg, you can use GNOM to determine a few more parameters of your data and to generate a P(r) function. You can find a manual for it on the EMBL website, but really all you need to do is follow the prompts, go for minimum errors and an excellent solution.
  • Now that you have an output from GNOM, you can use DAMMIN, DAMMIF or GASBOR to generate models. DAMMIN will aim to give you a more compact solution, so use it when you think your protein is fairly globular. GASBOR uses chain connectivity rather than compactness, so it can be more appropriate where you expect a large aspect ratio (you should have an idea from the Guinier and rod plots what your aspect ratio might be, and from your P(r) shape you should be able to predict what sort of basic shape your protein resembles). Be aware that GASBOR will take a long time for large proteins. DAMMIF is similar to DAMMIN but does not impose a maximum dimension (so avoids edge effects such as bending).
  • The predicted length, radius and aspect ratio of the protein can be calculated from the Guinier plots for globular and rod shaped particles using the following equations (Creighton, 1999).
            Rg^2 – Rc^2 = L^2/12
            Rc = R/√2
            A = L/(2R)
            where Rg is the radius of gyration, taken from the Guinier plot for globular particles, Rc is the radius of gyration of cross-section, taken from the Guinier plot for rod shaped particles, R is the radius, L is the length of the rod and A is the aspect ratio.
  • You should always create at least 10 independent models and then average them using DAMAVER.
  • You can also model mixtures of different monomeric conformers (MIXTURE), model samples with various oligomeric states (OLIGOMER) and model complexes of known structures (MASSHA). All of this software can be downloaded from http://www.embl-hamburg.de/ExternalInfo/Research/Sax/software.html
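
The two sets of formulas in the list above (molecular weight from I(0), and rod dimensions from Rg and Rc) can be sketched in Python as follows. The function names are hypothetical and the numerical inputs are illustrative placeholders rather than measured or recommended values. The molecular weight function assumes that I(0) is on an absolute scale in cm^-1 and that the other quantities use the units stated in the comments.

```python
import math

AVOGADRO = 6.02214076e23  # mol^-1

def molecular_weight_from_i0(i0, delta_sld, specific_volume, concentration):
    """mw = I(0) * N_A / ((delta_sld * v)^2 * c).

    Assumed units: i0 in cm^-1 (absolute scale), delta_sld in cm^-2,
    specific_volume in cm^3/g, concentration in g/cm^3. Result is in g/mol.
    """
    return i0 * AVOGADRO / ((delta_sld * specific_volume) ** 2 * concentration)

def rod_dimensions(rg, rc):
    """Length, radius and aspect ratio of a rod from Rg and Rc.

    Uses Rg^2 - Rc^2 = L^2/12, Rc = R/sqrt(2) and A = L/(2R).
    """
    length = math.sqrt(12.0 * (rg ** 2 - rc ** 2))
    radius = math.sqrt(2.0) * rc
    return length, radius, length / (2.0 * radius)

# Illustrative inputs only (not measured values).
delta_sld = 2.7e10        # cm^-2, placeholder scattering length density difference
specific_volume = 0.74    # cm^3/g, placeholder
concentration = 0.005     # g/cm^3, i.e. 5 mg/ml
i0 = 0.2                  # cm^-1, placeholder forward scattering
mw_example = molecular_weight_from_i0(i0, delta_sld, specific_volume, concentration)
print(f"apparent mw = {mw_example:.0f} g/mol")

# Round trip for a rod with L = 100 and R = 10 (same length units throughout).
rc_example = 10.0 / math.sqrt(2.0)
rg_example = math.sqrt(rc_example ** 2 + 100.0 ** 2 / 12.0)
length, radius, aspect = rod_dimensions(rg_example, rc_example)
print(f"L = {length:.1f}, R = {radius:.1f}, A = {aspect:.1f}")
```

An apparent molecular weight well above the value expected from the sequence is the sign of aggregation that this check is looking for.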


2. Fibre (Fiber) X-ray Diffraction

2.1 Preparing fibre diffraction samples for SAXS
  • It’s worth preparing your fibre diffraction samples in a number of different ways and, if you can get hold of a lab source instrument, seeing which samples give some diffraction.
  • If you are forming the fibre in vitro, try as many different conditions (buffer, time, temperature etc.) as you can, as well as things like crowding agents if need be. Remember to bring a blank: although you probably won’t buffer subtract, you’ll need a reference so you know which peaks arise from salts. These will mostly show up in your WAXS patterns.
  • If you want to look at natural samples, like biological tissues, it is also worth trying different ways of preparing your sample. As well as fresh ‘chunks’ of tissue, it’s worth bringing tissue sections of various thicknesses and cut from different planes, as well as stretched tissues (if they are strong enough to be stretched), dried tissues and freeze-dried tissues.
  • There are of course ‘many ways to skin a cat’, but if you’re looking for a cheap relatively easy way to prepare sections with low background from the support, I have found the following works:
  • Take a glass slide
  • Cut a small piece of mylar (I use 3.6 μm thick) to the size of the slide
  • With gloves on (and I often use a Kimwipe to assist), smooth the mylar down onto the slide. It should cling like cling wrap.
  • Cut your section in the cryotome.
  • Pick up the section as you normally would with your mylar-covered slide.
  • If you need to keep it in the freezer for a while, carefully place another piece of mylar on top (I use a slightly smaller piece of mylar for the top and tape the edges)
  • When you’re ready to collect data, carefully peel the mylar sandwich (with section inside) away from the glass slide and mount in whatever frame you have available at the beamline.
2.2 Analysing the data from fibre diffraction SAXS
  • Start by opening SAXS15id, then opening your parameters file (.sax) and your log file (.log), then open the file of interest.
  • To save, go to File → save profiles. I usually do one at a time, so that’s N profiles to N files.
  • Once you have exported your files, you can open them in Igor.
  • You can get a free demo version of Igor if you only intend to use it for a short period of time, but you will need to buy a license if you intend to use it for many months. Go to http://www.wavemetrics.com/
  • You will need to get the IRENA and the NIST add-ons. First go to File → Open experiment and choose the NIST experiment file you need (e.g. SANS_Reduction_v5.1.pxt), then go to Macros → Load IRENA macros.
  • Igor is now ready. I suggest you save experiment now as ready.pxp or something similar so you don’t have to waste time loading the add-ons each time. Go to SAS → Data import export (this will open a window), select your data path, choose your file, click Qvec as column 1, Int as column 2 and Err as column 3. Click test. Click use QRS wave names, then click import.
  • In the SANS reduction box, you can plot your data in the 1D ops tab. Just pick a path and then click plot and it will bring up the plot manager. Click load data from file and pick the file of interest.
  • Alternatively, you may want to peak fit your data. Go to Analysis → Packages → Multipeak fitting. This will bring up the Fit Setup Panel. From here, choose your x (q) and y (i) waves and click Set.
  • You may be able to get a good fit automatically, so first try pressing Auto. The Auto Find Panel will pop up. Click on Estimate params and Find peaks. Then on Fit Setup Panel, click Graph. If the peak positions look ok, you can choose the peak fitting parameters that suit you in Fit Setup Panel and click Do Fit. Scroll through the peaks using the up and down arrow buttons on Fit Setup Panel to view the position, amplitude and width of each peak.
  • If the auto fit is no good or is missing peaks, you can add peaks manually. Click Man, then on the Manual Peaks Panel, click Insert Peak. Right click and drag to create a new peak on the graph.
  • If you need to zoom in, make sure you don’t have Insert Peak clicked, then right click and drag to create a box around the area you want to zoom in on. Left click within the box and choose Expand.
  • To zoom out, left click and choose Autoscale axes.
  • If you want to get rid of all peak fitting, delete all peaks and go to Macros → Zap fit and residuals.
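
For readers who want a quick cross-check of peak positions outside Igor, the sketch below locates peaks in an exported q, I profile using three-point Gaussian interpolation: it finds each local maximum above a threshold and fits a parabola to ln I through that point and its two neighbours. This is a much simpler technique than the least-squares fitting done by Igor's Multipeak fitting package, and it assumes evenly spaced q values, well separated peaks and a background that has already been removed. The function name and the synthetic profile are hypothetical.

```python
import math

def find_gaussian_peaks(q, intensity, min_height):
    """Return (position, amplitude, sigma) for each local maximum above min_height.

    For a Gaussian, ln I is a parabola, so three points around the maximum
    determine the centre, height and width. Assumes evenly spaced q.
    """
    step = q[1] - q[0]
    peaks = []
    for k in range(1, len(q) - 1):
        left, mid, right = intensity[k - 1], intensity[k], intensity[k + 1]
        if mid < min_height or mid <= left or mid < right:
            continue                      # not a local maximum above the threshold
        if left <= 0 or right <= 0:
            continue                      # the logarithm needs positive intensities
        a, b, c = math.log(left), math.log(mid), math.log(right)
        curvature = a - 2.0 * b + c       # negative at a maximum
        if curvature >= 0:
            continue
        offset = 0.5 * step * (a - c) / curvature
        sigma = step / math.sqrt(-curvature)
        amplitude = math.exp(b + offset ** 2 / (2.0 * sigma ** 2))
        peaks.append((q[k] + offset, amplitude, sigma))
    return peaks

def gaussian(x, centre, amplitude, sigma):
    return amplitude * math.exp(-((x - centre) ** 2) / (2.0 * sigma ** 2))

# Synthetic profile with two well separated peaks (illustrative values only).
q_grid = [0.05 + 0.001 * k for k in range(200)]
profile = [gaussian(x, 0.1003, 10.0, 0.005) + gaussian(x, 0.2007, 4.0, 0.005)
           for x in q_grid]

peaks = find_gaussian_peaks(q_grid, profile, min_height=1.0)
for position, amplitude, sigma in peaks:
    print(f"q = {position:.4f}, amplitude = {amplitude:.2f}, sigma = {sigma:.4f}")
```

On real data with noise, overlapping peaks or a sloping background, a proper least-squares fit such as the one described in the list above is the better choice.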