Tijmen2/cosmosage_v1_gptq

@@ -1,7 +1,7 @@
 ---
 license: mit
 base_model: fblgit/UNA-TheBeagle-7b-v1
-inference: false
 model_name: cosmosage
 model_type: mistral
 prompt_template: '{optional system message}
@@ -15,10 +15,12 @@ pipeline_tag: question-answering
 Cosmosage is a natural-language cosmology assistant that can answer questions about cosmology.
-cosmosage_v1 is a fine tune of fblgit/UNA-TheBeagle-7b-v1 on various cosmology-related datasets including
-open-access textbooks and scientific publications. It is primarily intended for Q&A mode, where the model
-gives a single answer in response to a single question, though it performs decently as a chatbot, as well.
-See https://github.com/tijmen/cosmosage for more details.
 ## Usage
@@ -46,46 +48,50 @@ Feel free to experiment with different prompts and sampling parameters. No syste
 USER: <<<your question here>>>
 ASSISTANT:
 ```
-works well, but usually leads to very short answers to questions. If you want longer responses, you should
-ask for them in the system prompt.
-The following is an example prompt that leverages the model's instruction-following capabilities as
-well as two-shot prompting. I've been using it with temperature=0.4, min_p=0.1.
 ```
-You are an expert cosmology assistant, able to answer questions on the cosmic microwave background,
-galaxy formation, large scale structure, theoretical cosmology, inflation, big bang nucleosynthesis, cosmology
-instrumentation, and other related topics. Please assume the user is fluent in scientific terminology. Elaborate
-where possible to give a complete answer. If you do not know, say you do not know. You will get two examples,
-then be expected to answer one question to the best of your ability.
 ## EXAMPLE 1
 USER: What is the ISW effect?
-ASSISTANT: The Integrated Sachs-Wolfe effect describes the fact that cosmic microwave background photons gain
-energy as they fall into potential wells and lose it again as they climb back out. The origin of the effect is
-gravitational redshift. For static potential wells, there is no net effect. However, if the potentials grow or
-decay, there is a net effect, known as the Integrated Sachs-Wolfe effect. In a matter-dominated universe, linear
-potentials stay constant. The late-time ISW effect refers to the fact that at late times, dark energy causes a
-deviation from matter domination and linear potentials decay. This causes an effect in the cosmic microwave
-background which has been measured in cross-correlation with large-scale structure. The early-time ISW refers
-to the fact that right after recombination, radiation still contributed a non-negligible amount of the energy
-density of the universe, causing potentials to grow. This causes the photons to gain energy. The early-time ISW
-has not been detected experimentally.
 ## EXAMPLE 2
 USER: What are the definitions of galaxy cluster mass (M200m, M500c, etc.)?
-ASSISTANT: Calculating a galaxy cluster's mass is like calculating the population of a sprawling city; as you
-get further out from city center, the suburbs never end. Eventually you have to stop counting and just say that
-from there on the people belong to a different town.
-In cluster masses, we integrate from the cluster center out to some radius rΔ where Δ might be 200, 500, or
-some other number. The definition of this radius is that the density of the enclosed mass is Δ times the
-average density of the universe at that redshift. The associated mass is MΔ.
-Note that 'average density' can be defined in two ways. You can either include the energy density of dark
-energy - in which case you are comparing to the critical density at that redshift - or you can use the matter
-density only. This choice results in the c or m subscript, which stands for critical or mean. Note that the
-critical density is higher than the mean density, so it gives an earlier cutoff in the integral. Therefore
-'MΔc' masses are smaller than 'MΔm' ones.
 ## QUESTION
 USER: <<<INSERT YOUR QUESTION>>>
@@ -95,55 +101,93 @@ ASSISTANT:
 ## Example outputs
 ```
-USER: What is the current best measure of the optical depth to reionization?
-ASSISTANT: The current best measure of the optical depth to reionization is τ=0.054±0.007, as determined by the
-Planck Collaboration et al. (2018b). This value provides strong evidence for reionization at high redshift and
-helps constrain models of cosmic reionization.
-USER: In DfMux, the SQUID is nulled via Digital Active Nulling at MHz frequencies. How is the SQUID kept nulled
-at low frequency?
-ASSISTANT: In DfMux, the SQUID is kept nulled at low frequency by using a low-frequency flux-locked loop (FLL).
-The FLL is a feedback system that maintains the SQUID in its nulled state by adjusting the input current to
-cancel out the flux signal. This is necessary because the SQUID is sensitive to magnetic fields, and the
-magnetic field generated by the input current can change the bias flux, causing the SQUID to lose its null
-state. The FLL helps to maintain the null state by continuously adjusting the input current to counteract any
-changes in the bias flux.
 USER: Why does a cross-spectrum not suffer from noise bias?
-ASSISTANT: A cross-spectrum between two different tracers of the same underlying field does not suffer from
-noise bias because the noise in each tracer is uncorrelated with the noise in the other tracer. This is because
-the noise in each tracer is a random process that is independent of the other tracer. Therefore, the noise in
-one tracer does not affect the expected value of the cross-spectrum between the two tracers. This is in
-contrast to an auto-spectrum, which does suffer from noise bias because the noise in one tracer is correlated
-with the noise in the same tracer.
 USER: Summarize "Impact of beam far side-lobe knowledge in the presence of foregrounds for LiteBIRD".
-ASSISTANT: This paper investigates the impact of beam far side-lobes on the estimation of the tensor-to-scalar
-ratio in the presence of foregrounds for the LiteBIRD mission. The authors find that the presence of foregrounds
-can significantly affect the estimation of the tensor-to-scalar ratio, and that the knowledge of the beam far
-side-lobes is crucial for accurate results. The study also discusses the importance of accurately modeling the
-beam profile and the challenges in estimating the tensor-to-scalar ratio in the presence of foregrounds.
-USER: Explain a detailed design for a future CMB satellite. What detector technology would you choose? What
-optics? Photometer vs spectrometer? etc.
-ASSISTANT: A detailed design for a future CMB satellite would involve several key components. Firstly, the
-choice of detector technology is crucial. One option could be using Transition Edge Sensor (TES) bolometers
-coupled to superconducting detectors, similar to those used in the Planck mission. These detectors are highly
-sensitive and can measure temperature fluctuations in the CMB with high precision. Another option could be
-using Kinetic Inductance Detectors (KIDs), which offer fast response time and are less susceptible to
-systematic effects.
-Next, the optics design should aim to minimize stray light and ensure high optical efficiency. This can be
-achieved by using a combination of reflective and refractive optics, along with a series of baffles and
-filters to block unwanted stray light.
-The choice between a photometer and a spectrometer depends on the science goals of the mission. A photometer
-would provide broadband measurements of the CMB temperature and polarization anisotropies, while a spectrometer
-would allow for more detailed studies of the spectral properties of the CMB.
-Finally, the spacecraft design should consider factors such as thermal control, power generation, attitude
-determination and control, communication, and data handling systems. It should also include redundant systems
-to ensure reliability and fault tolerance during the long duration of the mission.
 ```
 ## Qualitative evaluation

 ---
+---
 license: mit
 base_model: fblgit/UNA-TheBeagle-7b-v1
 model_name: cosmosage
 model_type: mistral
 prompt_template: '{optional system message}
 Cosmosage is a natural-language cosmology assistant that can answer questions about cosmology.
+cosmosage_v1_gptq is an 8-bit quantized version of cosmosage_v1, which is a fine-tune of
+fblgit/UNA-TheBeagle-7b-v1 on various cosmology-related datasets, including open-access textbooks
+and scientific publications. It is primarily intended for Q&A mode, where the model gives a
+single answer in response to a single question, though it also performs decently as a chatbot.
+See https://github.com/tijmen/cosmosage for more details on the training process.
 ## Usage
 USER: <<<your question here>>>
 ASSISTANT:
 ```
+works well, but usually leads to very short answers to questions. If you want longer responses,
+you should ask for them in the system prompt.
+The following is an example prompt that leverages the model's instruction-following
+capabilities as well as two-shot prompting. I've been using it with temperature=0.4, min_p=0.1.
 ```
+You are an expert cosmology assistant, able to answer questions on the cosmic microwave
+background, galaxy formation, large scale structure, theoretical cosmology, inflation,
+big bang nucleosynthesis, cosmology instrumentation, and other related topics. Please assume the
+user is fluent in scientific terminology. Elaborate where possible to give a complete answer.
+If you do not know, say you do not know. You will get two examples, then be expected to answer
+one question to the best of your ability.
 ## EXAMPLE 1
 USER: What is the ISW effect?
+ASSISTANT: The Integrated Sachs-Wolfe effect describes the fact that cosmic microwave
+background photons gain energy as they fall into potential wells and lose it again as they
+climb back out. The origin of the effect is gravitational redshift. For static potential wells,
+there is no net effect. However, if the potentials grow or decay, there is a net effect, known
+as the Integrated Sachs-Wolfe effect. In a matter-dominated universe, linear potentials stay
+constant. The late-time ISW effect refers to the fact that at late times, dark energy causes
+a deviation from matter domination and linear potentials decay. This causes an effect in the
+cosmic microwave background which has been measured in cross-correlation with large-scale
+structure. The early-time ISW refers to the fact that right after recombination, radiation still
+contributed a non-negligible amount of the energy density of the universe, causing potentials
+to grow. This causes the photons to gain energy. The early-time ISW has not been detected
+experimentally.
 ## EXAMPLE 2
 USER: What are the definitions of galaxy cluster mass (M200m, M500c, etc.)?
+ASSISTANT: Calculating a galaxy cluster's mass is like calculating the population of a sprawling
+city; as you get further out from city center, the suburbs never end. Eventually you have to
+stop counting and just say that from there on the people belong to a different town.
+In cluster masses, we integrate from the cluster center out to some radius rΔ where Δ might be 200,
+500, or some other number. The definition of this radius is that the density of the enclosed mass
+is Δ times the average density of the universe at that redshift. The associated mass is MΔ.
+Note that 'average density' can be defined in two ways. You can either include the energy
+density of dark energy - in which case you are comparing to the critical density at that
+redshift - or you can use the matter density only. This choice results in the c or m subscript,
+which stands for critical or mean. Note that the critical density is higher than the mean density,
+so it gives an earlier cutoff in the integral. Therefore 'MΔc' masses are smaller than 'MΔm' ones.
 ## QUESTION
 USER: <<<INSERT YOUR QUESTION>>>
 ## Example outputs
 ```
+USER: In DfMux, the SQUID is nulled via Digital Active Nulling at MHz frequencies. How is the
+SQUID kept nulled at low frequency?
+ASSISTANT: In DfMux, the SQUID is kept nulled at low frequency by using a low-frequency flux-
+locked loop (FLL). The FLL is a feedback system that maintains the SQUID in its nulled state by
+adjusting the input current to cancel out the flux signal. This is necessary because the SQUID
+is sensitive to magnetic fields, and the magnetic field generated by the input current can change
+the bias flux, causing the SQUID to lose its null state. The FLL helps to maintain the null state
+by continuously adjusting the input current to counteract any changes in the bias flux.
 USER: Why does a cross-spectrum not suffer from noise bias?
+ASSISTANT: A cross-spectrum between two different tracers of the same underlying field does
+not suffer from noise bias because the noise in each tracer is uncorrelated with the noise in
+the other tracer. This is because the noise in each tracer is a random process that is independent
+of the other tracer. Therefore, the noise in one tracer does not affect the expected value of
+the cross-spectrum between the two tracers. This is in contrast to an auto-spectrum, which does
+suffer from noise bias because the noise in one tracer is correlated with the noise in the same
+tracer.
 USER: Summarize "Impact of beam far side-lobe knowledge in the presence of foregrounds for LiteBIRD".
+ASSISTANT: This paper investigates the impact of beam far side-lobes on the estimation of the
+tensor-to-scalar ratio in the presence of foregrounds for the LiteBIRD mission. The authors find that
+the presence of foregrounds can significantly affect the estimation of the tensor-to-scalar ratio,
+and that the knowledge of the beam far side-lobes is crucial for accurate results. The study also
+discusses the importance of accurately modeling the beam profile and the challenges in estimating
+the tensor-to-scalar ratio in the presence of foregrounds.
+USER: Explain a detailed design for a future CMB satellite. What detector technology would you
+choose? What optics? Photometer vs spectrometer? etc.
+ASSISTANT: A detailed design for a future CMB satellite would involve several key components.
+Firstly, the choice of detector technology is crucial. One option could be using Transition Edge
+Sensor (TES) bolometers coupled to superconducting detectors, similar to those used in the Planck
+mission. These detectors are highly sensitive and can measure temperature fluctuations in the CMB
+with high precision. Another option could be using Kinetic Inductance Detectors (KIDs), which
+offer fast response time and are less susceptible to systematic effects.
+Next, the optics design should aim to minimize stray light and ensure high optical efficiency.
+This can be achieved by using a combination of reflective and refractive optics, along with a
+series of baffles and filters to block unwanted stray light.
+The choice between a photometer and a spectrometer depends on the science goals of the mission.
+A photometer would provide broadband measurements of the CMB temperature and polarization
+anisotropies, while a spectrometer would allow for more detailed studies of the spectral properties
+of the CMB.
+Finally, the spacecraft design should consider factors such as thermal control, power generation,
+attitude determination and control, communication, and data handling systems. It should also
+include redundant systems to ensure reliability and fault tolerance during the long duration
+of the mission.
+You are an expert of cosmological instrumentation. Give long answers.
+USER: What is Digital Active Nulling and how is it used in the SPT-3G instrument?
+ASSISTANT: Digital Active Nulling is a technique used in DfMux to null the current flowing
+through the bolometer by applying a voltage at the input of a transimpedance amplifier. In the
+SPT-3G instrument, Digital Active Nulling is implemented in firmware running on a Field Programmable
+Gate Array (FPGA). This allows for more precise nulling and allows for longer wiring lengths between
+the room temperature electronics and the cryogenic detector array.
+You are an expert in cosmology papers. Answer questions about arXiv papers.
+USER: What can you tell me about 2312.09001, "Impact of beam far side-lobe knowledge in the presence
+of foregrounds for LiteBIRD"?
+ASSISTANT: This summary focuses on "Impact of beam far side-lobe knowledge in the presence of
+foregrounds for LiteBIRD", a noteworthy paper with the arXiv ID 2312.09001. The key points are the
+following. This paper investigates the impact of beam far side-lobe knowledge on the measurement of
+the tensor-to-scalar ratio (r) in cosmic microwave background (CMB) observations for the LiteBIRD
+experiment. It presents a study on the effect of imperfect beam knowledge and provides a methodology
+and framework for end-to-end studies of beam far side-lobe effects for space-borne CMB experiments.
+The authors find that the difference in the integrated fractional power between the estimated and
+true beams in the far side-lobes region, δr, is mainly due to the beams' shape. They conclude that
+the requirements on δRlim are independent of the beam shape. The study highlights the importance
+of accurate beam modeling and measurement for CMB experiments.
+You are an expert in cosmology. Give factual answers. If you are not sure, say "I don't know".
+USER: What is the best measurement of τ from Planck?
+ASSISTANT: The best measurement of τ from Planck is τ=0.054±0.007, which is obtained by combining
+the low-ℓ likelihood with the high-ℓ likelihood. This measurement is based on the Planck 2018 data
+release.
+USER: How many cosmologists does it take to screw in a lightbulb?
+ASSISTANT: It takes three generations of cosmologists to fully appreciate the significance of a
+result. The first generation sees only what looks like an interesting anomaly. The second
+generation comes up with various contorted explanations within standard ΛCDM. Only later, when the
+third generation looks at the data, will someone notice that they forgot about the CMB monopole
+temperature...
+USER: What is the best measurement of τ from Planck?
+ASSISTANT: The best measurement of τ from Planck comes from the 2018 data release, which
+gives τ=0.054±0.007.
 ```
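
The prompt layout described under Usage (optional system message, optional worked examples, then `USER:` / `ASSISTANT:`) can be assembled programmatically. The following is a minimal sketch, not part of the cosmosage repository: `build_prompt` and `SAMPLING` are hypothetical names, and the exact newline placement between sections is an assumption, since the card does not spell it out. The sampling values are the ones quoted above; how they are passed to the model depends on the inference backend you use.

```python
from typing import Sequence, Tuple

# Sampling settings quoted in the card. The dict itself is illustrative;
# pass the values to whichever backend you use for generation.
SAMPLING = {"temperature": 0.4, "min_p": 0.1}


def build_prompt(
    system: str,
    question: str,
    examples: Sequence[Tuple[str, str]] = (),
) -> str:
    """Assemble a cosmosage-style prompt string.

    Hypothetical helper: the section headers follow the example prompt in
    the card, but the single-newline separators are an assumption.
    """
    parts = []
    if system:
        parts.append(system.strip())
    # Each worked example becomes a numbered "## EXAMPLE n" section.
    for i, (q, a) in enumerate(examples, start=1):
        parts.append(f"## EXAMPLE {i}\nUSER: {q}\nASSISTANT: {a}")
    # The "## QUESTION" header is only used when examples precede it.
    header = "## QUESTION\n" if examples else ""
    parts.append(f"{header}USER: {question}\nASSISTANT:")
    return "\n".join(parts)


if __name__ == "__main__":
    print(
        build_prompt(
            "You are an expert in cosmology. Give factual answers.",
            "What is the best measurement of τ from Planck?",
        )
    )
```

Calling `build_prompt` with no examples reproduces the plain `USER:` / `ASSISTANT:` form, while passing two `(question, answer)` pairs gives the two-shot layout shown earlier.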
 ## Qualitative evaluation