Update README.md
README.md
CHANGED
|
@@ -1,7 +1,7 @@
|
|
| 1 |
---
|
|
|
|
| 2 |
license: mit
|
| 3 |
base_model: fblgit/UNA-TheBeagle-7b-v1
|
| 4 |
-
inference: false
|
| 5 |
model_name: cosmosage
|
| 6 |
model_type: mistral
|
| 7 |
prompt_template: '{optional system message}
|
|
@@ -15,10 +15,12 @@ pipeline_tag: question-answering
|
|
| 15 |
|
| 16 |
Cosmosage is a natural-language cosmology assistant that can answer questions about cosmology.
|
| 17 |
|
| 18 |
-
|
| 19 |
-
|
| 20 |
-
|
| 21 |
-
|
| 22 |
|
| 23 |
## Usage
|
| 24 |
|
|
@@ -46,46 +48,50 @@ Feel free to experiment with different prompts and sampling parameters. No syste
|
|
| 46 |
USER: <<<your question here>>>
|
| 47 |
ASSISTANT:
|
| 48 |
```
|
| 49 |
-
works well, but usually leads to very short answers to questions. If you want longer responses,
|
| 50 |
-
ask for them in the system prompt.
|
| 51 |
|
| 52 |
-
The following is an example prompt that leverages the model's instruction-following
|
| 53 |
-
well as two-shot prompting. I've been using it with temperature=0.4, min_p=0.1.
|
| 54 |
|
| 55 |
```
|
| 56 |
-
You are an expert cosmology assistant, able to answer questions on the cosmic microwave
|
| 57 |
-
galaxy formation, large scale structure, theoretical cosmology, inflation,
|
| 58 |
-
instrumentation, and other related topics. Please assume the
|
| 59 |
-
|
| 60 |
-
|
|
|
|
| 61 |
|
| 62 |
|
| 63 |
## EXAMPLE 1
|
| 64 |
USER: What is the ISW effect?
|
| 65 |
-
ASSISTANT: The Integrated Sachs-Wolfe effect describes the fact that cosmic microwave
|
| 66 |
-
energy as they fall into potential wells and lose it again as they
|
| 67 |
-
|
| 68 |
-
|
| 69 |
-
|
| 70 |
-
|
| 71 |
-
|
| 72 |
-
|
| 73 |
-
|
| 74 |
-
|
| 75 |
|
| 76 |
## EXAMPLE 2
|
| 77 |
USER: What are the definitions of galaxy cluster mass (M200m, M500c, etc.)?
|
| 78 |
-
ASSISTANT: Calculating a galaxy cluster's mass is like calculating the population of a sprawling
|
| 79 |
-
get further out from city center, the suburbs never end. Eventually you have to
|
| 80 |
-
from there on the people belong to a different town.
|
| 81 |
-
|
| 82 |
-
|
| 83 |
-
|
| 84 |
-
|
| 85 |
-
|
| 86 |
-
density
|
| 87 |
-
|
| 88 |
-
|
|
|
|
| 89 |
|
| 90 |
## QUESTION
|
| 91 |
USER: <<<INSERT YOUR QUESTION>>>
|
|
@@ -95,55 +101,93 @@ ASSISTANT:
|
|
| 95 |
## Example outputs
|
| 96 |
|
| 97 |
```
|
| 98 |
-
USER:
|
| 99 |
-
|
| 100 |
-
|
| 101 |
-
|
| 102 |
-
|
| 103 |
-
|
| 104 |
-
|
| 105 |
-
|
| 106 |
-
The FLL is a feedback system that maintains the SQUID in its nulled state by adjusting the input current to
|
| 107 |
-
cancel out the flux signal. This is necessary because the SQUID is sensitive to magnetic fields, and the
|
| 108 |
-
magnetic field generated by the input current can change the bias flux, causing the SQUID to lose its null
|
| 109 |
-
state. The FLL helps to maintain the null state by continuously adjusting the input current to counteract any
|
| 110 |
-
changes in the bias flux.
|
| 111 |
|
| 112 |
USER: Why does a cross-spectrum not suffer from noise bias?
|
| 113 |
-
ASSISTANT: A cross-spectrum between two different tracers of the same underlying field does
|
| 114 |
-
noise bias because the noise in each tracer is uncorrelated with the noise in
|
| 115 |
-
the noise in each tracer is a random process that is independent
|
| 116 |
-
one tracer does not affect the expected value of
|
| 117 |
-
|
| 118 |
-
with the noise in the same
|
|
|
|
| 119 |
|
| 120 |
USER: Summarize "Impact of beam far side-lobe knowledge in the presence of foregrounds for LiteBIRD".
|
| 121 |
-
ASSISTANT: This paper investigates the impact of beam far side-lobes on the estimation of the
|
| 122 |
-
ratio in the presence of foregrounds for the LiteBIRD mission. The authors find that
|
| 123 |
-
can significantly affect the estimation of the tensor-to-scalar ratio,
|
| 124 |
-
side-lobes is crucial for accurate results. The study also
|
| 125 |
-
|
| 126 |
-
|
| 127 |
-
|
| 128 |
-
|
| 129 |
-
|
| 130 |
-
|
| 131 |
-
|
| 132 |
-
|
| 133 |
-
|
| 134 |
-
|
| 135 |
-
|
| 136 |
-
|
| 137 |
-
|
| 138 |
-
|
| 139 |
-
|
| 140 |
-
|
| 141 |
-
|
| 142 |
-
|
| 143 |
-
|
| 144 |
-
|
| 145 |
-
|
| 146 |
-
|
| 147 |
```
|
| 148 |
|
| 149 |
## Qualitative evaluation
|
|
|
|
| 1 |
---
|
| 2 |
+
---
|
| 3 |
license: mit
|
| 4 |
base_model: fblgit/UNA-TheBeagle-7b-v1
|
|
|
|
| 5 |
model_name: cosmosage
|
| 6 |
model_type: mistral
|
| 7 |
prompt_template: '{optional system message}
|
|
|
|
| 15 |
|
| 16 |
Cosmosage is a natural-language cosmology assistant that can answer questions about cosmology.
|
| 17 |
|
| 18 |
+
cosmosage_v1_gptq is an 8-bit quantized version of cosmosage_v1, which is a fine-tune of
|
| 19 |
+
fblgit/UNA-TheBeagle-7b-v1 on various cosmology-related datasets including open-access textbooks
|
| 20 |
+
and scientific publications. It is primarily intended for Q&A mode, where the model gives a
|
| 21 |
+
single answer in response to a single question, though it also performs decently as a chatbot.
|
| 22 |
+
|
| 23 |
+
See https://github.com/tijmen/cosmosage for more details on the training process.
|
| 24 |
|
| 25 |
## Usage
|
| 26 |
|
|
|
|
| 48 |
USER: <<<your question here>>>
|
| 49 |
ASSISTANT:
|
| 50 |
```
|
| 51 |
+
works well, but usually leads to very short answers to questions. If you want longer responses,
|
| 52 |
+
you should ask for them in the system prompt.
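
As a rough illustration of the prompt layout described above, the sketch below assembles a single-turn prompt with an optional system message. The helper name `build_prompt` is hypothetical, and joining the pieces with single newlines is an assumption based on the template shown here, not something the model card specifies in full.

```python
def build_prompt(question: str, system_message: str = "") -> str:
    """Format a single-turn prompt in the USER/ASSISTANT layout.

    Assumption: the optional system message, the USER line and the
    ASSISTANT line are separated by single newlines.
    """
    parts = []
    # The system message is optional; skip it when empty or whitespace-only.
    if system_message.strip():
        parts.append(system_message.strip())
    parts.append(f"USER: {question.strip()}")
    # Leave the ASSISTANT turn open so the model completes it.
    parts.append("ASSISTANT:")
    return "\n".join(parts)


if __name__ == "__main__":
    # Asking for longer responses through the system message, as suggested above.
    print(build_prompt(
        "What is the ISW effect?",
        system_message="You are an expert in cosmology. Give long answers.",
    ))
```

Passing a system message such as "Give long answers." is one way to ask for the longer responses mentioned above; leaving it empty reproduces the bare USER/ASSISTANT prompt.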
|
| 53 |
|
| 54 |
+
The following is an example prompt that leverages the model's instruction-following
|
| 55 |
+
capabilities as well as two-shot prompting. I've been using it with temperature=0.4, min_p=0.1.
|
| 56 |
|
| 57 |
```
|
| 58 |
+
You are an expert cosmology assistant, able to answer questions on the cosmic microwave
|
| 59 |
+
background, galaxy formation, large scale structure, theoretical cosmology, inflation,
|
| 60 |
+
big bang nucleosynthesis, cosmology instrumentation, and other related topics. Please assume the
|
| 61 |
+
user is fluent in scientific terminology. Elaborate where possible to give a complete answer.
|
| 62 |
+
If you do not know, say you do not know. You will get two examples, then be expected to answer
|
| 63 |
+
one question to the best of your ability.
|
| 64 |
|
| 65 |
|
| 66 |
## EXAMPLE 1
|
| 67 |
USER: What is the ISW effect?
|
| 68 |
+
ASSISTANT: The Integrated Sachs-Wolfe effect describes the fact that cosmic microwave
|
| 69 |
+
background photons gain energy as they fall into potential wells and lose it again as they
|
| 70 |
+
climb back out. The origin of the effect is gravitational redshift. For static potential wells,
|
| 71 |
+
there is no net effect. However, if the potentials grow or decay, there is a net effect, known
|
| 72 |
+
as the Integrated Sachs-Wolfe effect. In a matter-dominated universe, linear potentials stay
|
| 73 |
+
constant. The late-time ISW effect refers to the fact that at late times, dark energy causes
|
| 74 |
+
a deviation from matter domination and linear potentials decay. This causes an effect in the
|
| 75 |
+
cosmic microwave background which has been measured in cross-correlation with large-scale
|
| 76 |
+
structure. The early-time ISW refers to the fact that right after recombination, radiation still
|
| 77 |
+
contributed a non-negligible amount of the energy density of the universe, causing potentials
|
| 78 |
+
to grow. This causes the photons to gain energy. The early-time ISW has not been detected
|
| 79 |
+
experimentally.
|
| 80 |
|
| 81 |
## EXAMPLE 2
|
| 82 |
USER: What are the definitions of galaxy cluster mass (M200m, M500c, etc.)?
|
| 83 |
+
ASSISTANT: Calculating a galaxy cluster's mass is like calculating the population of a sprawling
|
| 84 |
+
city; as you get further out from city center, the suburbs never end. Eventually you have to
|
| 85 |
+
stop counting and just say that from there on the people belong to a different town.
|
| 86 |
+
|
| 87 |
+
In cluster masses, we integrate from the cluster center out to some radius rΔ where Δ might be 200,
|
| 88 |
+
500, or some other number. The definition of this radius is that the density of the enclosed mass
|
| 89 |
+
is Δ times the average density of the universe at that redshift. The associated mass is MΔ.
|
| 90 |
+
Note that 'average density' can be defined in two ways. You can either include the energy
|
| 91 |
+
density of dark energy - in which case you are comparing to the critical density at that
|
| 92 |
+
redshift - or you can use the matter density only. This choice results in the c or m subscript,
|
| 93 |
+
which stands for critical or mean. Note that the critical density is higher than the mean density,
|
| 94 |
+
so it gives an earlier cutoff in the integral. Therefore 'MΔc' masses are smaller than 'MΔm' ones.
|
| 95 |
|
| 96 |
## QUESTION
|
| 97 |
USER: <<<INSERT YOUR QUESTION>>>
|
|
|
|
| 101 |
## Example outputs
|
| 102 |
|
| 103 |
```
|
| 104 |
+
USER: In DfMux, the SQUID is nulled via Digital Active Nulling at MHz frequencies. How is the
|
| 105 |
+
SQUID kept nulled at low frequency?
|
| 106 |
+
ASSISTANT: In DfMux, the SQUID is kept nulled at low frequency by using a low-frequency flux-
|
| 107 |
+
locked loop (FLL). The FLL is a feedback system that maintains the SQUID in its nulled state by
|
| 108 |
+
adjusting the input current to cancel out the flux signal. This is necessary because the SQUID
|
| 109 |
+
is sensitive to magnetic fields, and the magnetic field generated by the input current can change
|
| 110 |
+
the bias flux, causing the SQUID to lose its null state. The FLL helps to maintain the null state
|
| 111 |
+
by continuously adjusting the input current to counteract any changes in the bias flux.
|
| 112 |
|
| 113 |
USER: Why does a cross-spectrum not suffer from noise bias?
|
| 114 |
+
ASSISTANT: A cross-spectrum between two different tracers of the same underlying field does
|
| 115 |
+
not suffer from noise bias because the noise in each tracer is uncorrelated with the noise in
|
| 116 |
+
the other tracer. This is because the noise in each tracer is a random process that is independent
|
| 117 |
+
of the other tracer. Therefore, the noise in one tracer does not affect the expected value of
|
| 118 |
+
the cross-spectrum between the two tracers. This is in contrast to an auto-spectrum, which does
|
| 119 |
+
suffer from noise bias because the noise in one tracer is correlated with the noise in the same
|
| 120 |
+
tracer.
|
| 121 |
|
| 122 |
USER: Summarize "Impact of beam far side-lobe knowledge in the presence of foregrounds for LiteBIRD".
|
| 123 |
+
ASSISTANT: This paper investigates the impact of beam far side-lobes on the estimation of the
|
| 124 |
+
tensor-to-scalar ratio in the presence of foregrounds for the LiteBIRD mission. The authors find that
|
| 125 |
+
the presence of foregrounds can significantly affect the estimation of the tensor-to-scalar ratio,
|
| 126 |
+
and that the knowledge of the beam far side-lobes is crucial for accurate results. The study also
|
| 127 |
+
discusses the importance of accurately modeling the beam profile and the challenges in estimating
|
| 128 |
+
the tensor-to-scalar ratio in the presence of foregrounds.
|
| 129 |
+
|
| 130 |
+
USER: Explain a detailed design for a future CMB satellite. What detector technology would you
|
| 131 |
+
choose? What optics? Photometer vs spectrometer? etc.
|
| 132 |
+
ASSISTANT: A detailed design for a future CMB satellite would involve several key components.
|
| 133 |
+
Firstly, the choice of detector technology is crucial. One option could be using Transition Edge
|
| 134 |
+
Sensor (TES) bolometers coupled to superconducting detectors, similar to those used in the Planck
|
| 135 |
+
mission. These detectors are highly sensitive and can measure temperature fluctuations in the CMB
|
| 136 |
+
with high precision. Another option could be using Kinetic Inductance Detectors (KIDs), which
|
| 137 |
+
offer fast response time and are less susceptible to systematic effects.
|
| 138 |
+
|
| 139 |
+
Next, the optics design should aim to minimize stray light and ensure high optical efficiency.
|
| 140 |
+
This can be achieved by using a combination of reflective and refractive optics, along with a
|
| 141 |
+
series of baffles and filters to block unwanted stray light.
|
| 142 |
+
|
| 143 |
+
The choice between a photometer and a spectrometer depends on the science goals of the mission.
|
| 144 |
+
A photometer would provide broadband measurements of the CMB temperature and polarization
|
| 145 |
+
anisotropies, while a spectrometer would allow for more detailed studies of the spectral properties
|
| 146 |
+
of the CMB.
|
| 147 |
+
|
| 148 |
+
Finally, the spacecraft design should consider factors such as thermal control, power generation,
|
| 149 |
+
attitude determination and control, communication, and data handling systems. It should also
|
| 150 |
+
include redundant systems to ensure reliability and fault tolerance during the long duration
|
| 151 |
+
of the mission.
|
| 152 |
+
|
| 153 |
+
You are an expert of cosmological instrumentation. Give long answers.
|
| 154 |
+
USER: What is Digital Active Nulling and how is it used in the SPT-3G instrument?
|
| 155 |
+
ASSISTANT: Digital Active Nulling is a technique used in DfMux to null the current flowing
|
| 156 |
+
through the bolometer by applying a voltage at the input of a transimpedance amplifier. In the
|
| 157 |
+
SPT-3G instrument, Digital Active Nulling is implemented in firmware running on a Field Programmable
|
| 158 |
+
Gate Array (FPGA). This allows for more precise nulling and allows for longer wiring lengths between
|
| 159 |
+
the room temperature electronics and the cryogenic detector array.
|
| 160 |
+
|
| 161 |
+
You are an expert in cosmology papers. Answer questions about arXiv papers.
|
| 162 |
+
USER: What can you tell me about 2312.09001, "Impact of beam far side-lobe knowledge in the presence
|
| 163 |
+
of foregrounds for LiteBIRD"?
|
| 164 |
+
ASSISTANT: This summary focuses on "Impact of beam far side-lobe knowledge in the presence of
|
| 165 |
+
foregrounds for LiteBIRD", a noteworthy paper with the arXiv ID 2312.09001. The key points are the
|
| 166 |
+
following. This paper investigates the impact of beam far side-lobe knowledge on the measurement of
|
| 167 |
+
the tensor-to-scalar ratio (r) in cosmic microwave background (CMB) observations for the LiteBIRD
|
| 168 |
+
experiment. It presents a study on the effect of imperfect beam knowledge and provides a methodology
|
| 169 |
+
and framework for end-to-end studies of beam far side-lobe effects for space-borne CMB experiments.
|
| 170 |
+
The authors find that the difference in the integrated fractional power between the estimated and
|
| 171 |
+
true beams in the far side-lobes region, δr, is mainly due to the beams' shape. They conclude that
|
| 172 |
+
the requirements on δRlim are independent of the beam shape. The study highlights the importance
|
| 173 |
+
of accurate beam modeling and measurement for CMB experiments.
|
| 174 |
+
|
| 175 |
+
You are an expert in cosmology. Give factual answers. If you are not sure, say "I don't know".
|
| 176 |
+
USER: What is the best measurement of τ from Planck?
|
| 177 |
+
ASSISTANT: The best measurement of τ from Planck is τ=0.054±0.007, which is obtained by combining
|
| 178 |
+
the low-ℓ likelihood with the high-ℓ likelihood. This measurement is based on the Planck 2018 data
|
| 179 |
+
release.
|
| 180 |
+
|
| 181 |
+
USER: How many cosmologists does it take to screw in a lightbulb?
|
| 182 |
+
ASSISTANT: It takes three generations of cosmologists to fully appreciate the significance of a
|
| 183 |
+
result. The first generation sees only what looks like an interesting anomaly. The second
|
| 184 |
+
generation comes up with various contorted explanations within standard ΛCDM. Only later, when the
|
| 185 |
+
third generation looks at the data, will someone notice that they forgot about the CMB monopole
|
| 186 |
+
temperature...
|
| 187 |
+
|
| 188 |
+
USER: What is the best measurement of τ from Planck?
|
| 189 |
+
ASSISTANT: The best measurement of τ from Planck comes from the 2018 data release, which
|
| 190 |
+
gives τ=0.054±0.007.
|
| 191 |
```
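
The Usage section mentions sampling with temperature=0.4 and min_p=0.1. For readers unfamiliar with min-p sampling, here is a minimal, library-free sketch of the idea: after temperature scaling, tokens whose probability is below `min_p` times the probability of the most likely token are discarded, and the next token is drawn from the rest. The function names are hypothetical, and applying temperature before the min-p filter is an assumption; inference backends differ in the order they apply these steps.

```python
import math
import random


def min_p_filter(logits, temperature=0.4, min_p=0.1):
    """Return (kept_indices, kept_probs) after temperature scaling and min-p filtering."""
    # Temperature scaling: values below 1 sharpen the distribution.
    scaled = [x / temperature for x in logits]
    # Numerically stable softmax.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Keep tokens with probability >= min_p * (probability of the top token).
    threshold = min_p * max(probs)
    kept = [(i, p) for i, p in enumerate(probs) if p >= threshold]
    return [i for i, _ in kept], [p for _, p in kept]


def min_p_sample(logits, temperature=0.4, min_p=0.1, rng=None):
    """Draw one token index from the min-p filtered distribution."""
    rng = rng or random.Random()
    indices, weights = min_p_filter(logits, temperature, min_p)
    # random.choices renormalizes the weights internally.
    return rng.choices(indices, weights=weights, k=1)[0]


if __name__ == "__main__":
    toy_logits = [2.0, 1.0, -3.0]  # made-up values for illustration only
    print(min_p_filter(toy_logits, temperature=1.0)[0])
    print(min_p_filter(toy_logits, temperature=0.4)[0])
    print(min_p_sample(toy_logits, rng=random.Random(0)))
```

With these toy logits, lowering the temperature from 1.0 to 0.4 shrinks the set of candidate tokens, which fits the short, focused answers shown in the examples above.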
|
| 192 |
|
| 193 |
## Qualitative evaluation
|